Training AI to Understand Legal Texts in Different Domains

by Instancing

April 2nd, 2025

Too Long; Didn't Read

This section evaluates how rhetorical role classifiers generalize across domains, showing that methods like prototypical learning and discourse-aware contrastive learning improve cross-domain performance. Prototypical models prove robust in preventing overfitting to domain-specific features.

Abstract and 1. Introduction

  2. Related Work

  3. Task, Datasets, Baseline

  4. RQ 1: Leveraging the Neighbourhood at Inference

    4.1. Methods

    4.2. Experiments

  5. RQ 2: Leveraging the Neighbourhood at Training

    5.1. Methods

    5.2. Experiments

  6. RQ 3: Cross-Domain Generalizability

  7. Conclusion

  8. Limitations

  9. Ethics Statement

  10. Bibliographical References

6. RQ 3: Cross-Domain Generalizability

To evaluate how well our proposed methods transfer across domains, we train the model on one dataset (source) and assess its performance on the other datasets (target) in a blind zero-shot manner. We use the Paheli, M-CL, and M-IT datasets, which span diverse domains but share the same label space of 7 rhetorical roles. The resulting Macro-F1 scores are presented in Table 3.
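To make the evaluation protocol concrete, here is a minimal sketch of the source-to-target loop, assuming hypothetical `train_fn` / `predict_fn` wrappers around the sentence classifier and simple dict-based dataset splits; these names are illustrative and not taken from the paper's codebase.

```python
from itertools import product
from sklearn.metrics import f1_score

def cross_domain_macro_f1(datasets, train_fn, predict_fn):
    """Train on each source domain and evaluate zero-shot on every target domain.

    `datasets` maps a name ("Paheli", "M-CL", "M-IT") to {"train": ..., "test": ...}
    splits, each holding parallel lists of sentences and rhetorical-role labels.
    `train_fn` and `predict_fn` wrap whichever classifier is being evaluated.
    """
    scores = {}
    for source, target in product(datasets, datasets):
        model = train_fn(datasets[source]["train"])  # fit on the source domain only
        preds = predict_fn(model, datasets[target]["test"]["sentences"])
        scores[(source, target)] = f1_score(
            datasets[target]["test"]["labels"], preds, average="macro"
        )
    return scores  # diagonal entries are in-domain, off-diagonal are cross-domain
```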


Naturally, models trained and tested on the same domain outperform those trained on a different domain (e.g., the baseline model trained and tested on Paheli achieves a Macro-F1 of 62.43, whereas the one trained on M-CL and tested on Paheli achieves 54.71). Interestingly, the baseline model shows an ability to transfer knowledge from one domain to another, outperforming random guessing[1] across all datasets. While the discourse-aware contrastive model improves in-domain performance, it marginally reduces cross-domain performance across all datasets compared to the baseline (e.g., Disc. Contr. trained on M-CL and tested on Paheli achieves a Macro-F1 of 54.04, while the baseline in the same setup achieves 54.71). This can be attributed to the model capturing domain-specific features while minimizing distances between similar instances during contrastive learning. In contrast, the single- and multi-prototypical models enhance cross-domain transfer compared to the baseline, except when trained on M-IT. This indicates that prototypes act as more robust anchor points, preventing the overfitting to noisy neighbours observed in contrastive models. Between the two, the single-prototype model tends to perform better: its single representation is agnostic to domain-specific variations and encapsulates core class characteristics, making it more adept in cross-domain scenarios.

Furthermore, coupling discourse-aware contrastive learning with prototypical models boosts cross-domain performance, except when trained on M-IT. This behaviour of M-IT may be attributed to its marginal in-domain improvements, suggesting overfitting to domain-specific features that limits cross-domain generalization. This prompts questions about how to select an optimal source dataset for improved performance on target datasets, warranting further investigation. For instance, when testing the baseline on Paheli, training on M-CL yields a Macro-F1 of 54.71, while training on M-IT yields 52.97. Additionally, exploring joint training with multiple datasets could shed light on its impact on both the in-domain source and unseen target datasets.
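As an illustration of why a single prototype per role can stay agnostic to domain-specific variation, the sketch below builds one prototype per rhetorical role as the mean of its training sentence embeddings and classifies by nearest prototype under cosine similarity; the embedding matrix and the similarity choice are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def build_prototypes(embeddings, labels):
    """One prototype per rhetorical role: the mean embedding of its training sentences."""
    prototypes = {}
    for role in set(labels):
        mask = np.array([lab == role for lab in labels])
        prototypes[role] = embeddings[mask].mean(axis=0)
    return prototypes

def nearest_prototype(embedding, prototypes):
    """Assign the role whose prototype is most cosine-similar to the sentence embedding."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return max(prototypes, key=lambda role: cos(embedding, prototypes[role]))
```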


7. Conclusion

In this paper, we have demonstrated the potential for enhancing the performance of rhetorical role classifiers by leveraging knowledge from neighbours, i.e., semantically similar instances. Interpolation with kNN and multiple prototypes at inference time has shown promising improvements, especially in addressing the challenging issue of label imbalance, without requiring re-training. Additionally, incorporating neighbourhood constraints during training through our proposed discourse-aware contrastive learning and prototypical learning has also yielded improvements. Combining both methods boosts performance further, indicating their complementary nature. Notably, the prototypical methods have proven robust, showing performance gains even in cross-domain scenarios and generalizing beyond the domains they were trained on.
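For readers who want a feel for the inference-time kNN interpolation mentioned above, here is a rough sketch (not the authors' implementation) that mixes the classifier's label distribution with one induced from the labels of retrieved neighbours; the mixing weight `lam` and the similarity-softmax weighting are illustrative assumptions.

```python
import numpy as np

def knn_interpolate(model_probs, neighbour_labels, neighbour_sims, num_labels, lam=0.5):
    """Blend the classifier's distribution with a kNN distribution over retrieved neighbours.

    model_probs:      (num_labels,) softmax output of the base classifier
    neighbour_labels: label indices of the k retrieved training sentences
    neighbour_sims:   similarity of each retrieved neighbour to the query sentence
    lam:              weight on the kNN distribution (assumed value, would be tuned)
    """
    knn_probs = np.zeros(num_labels)
    weights = np.exp(neighbour_sims - np.max(neighbour_sims))  # softmax over similarities
    weights /= weights.sum()
    for label, w in zip(neighbour_labels, weights):
        knn_probs[label] += w
    mixed = lam * knn_probs + (1.0 - lam) * model_probs
    return int(np.argmax(mixed))
```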



Authors:

(1) Santosh T.Y.S.S, School of Computation, Information, and Technology; Technical University of Munich, Germany (santosh.tokala@tum.de);

(2) Hassan Sarwat, School of Computation, Information, and Technology; Technical University of Munich, Germany (hassan.sarwat@tum.de);

(3) Ahmed Abdou, School of Computation, Information, and Technology; Technical University of Munich, Germany (ahmed.abdou@tum.de);

(4) Matthias Grabmair, School of Computation, Information, and Technology; Technical University of Munich, Germany (matthias.grabmair@tum.de).


This paper is available on arXiv under a CC BY 4.0 Deed (Attribution 4.0 International) license.

[1] Random choices are drawn from the training set’s label distribution (a uniform distribution leads to even lower scores). Results are averaged over 10 runs.
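The random baseline described in this footnote can be sketched as follows: predictions are sampled from the training label distribution and Macro-F1 is averaged over 10 runs; the function name and seed are illustrative.

```python
import numpy as np
from collections import Counter
from sklearn.metrics import f1_score

def random_baseline_macro_f1(train_labels, test_labels, runs=10, seed=0):
    """Sample predictions from the training label distribution; average Macro-F1 over runs."""
    rng = np.random.default_rng(seed)
    roles = sorted(set(train_labels))
    counts = Counter(train_labels)
    probs = np.array([counts[r] for r in roles], dtype=float)
    probs /= probs.sum()
    scores = []
    for _ in range(runs):
        preds = rng.choice(roles, size=len(test_labels), p=probs)
        scores.append(f1_score(test_labels, preds, average="macro"))
    return float(np.mean(scores))
```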
