Reliability

CountryProfile YAML System — Production Deployment Reliability Assessment

Assessment Date: 2026-06-25
Assessment Target: Five country profiles (KR, US, DE, TH, SA)
Assessment Criteria: Can it be used as a citation extraction system in actual law firms?
Audience: Developers (deployment decision makers)


One-Line Summary

Only KR is ready for production deployment. US and DE can be deployed after modifications. TH and SA require additional work.


Assessment Criteria Explanation

| Grade | Meaning | Criteria |
|---|---|---|
| ✅ Deployable | Can be used in production as-is | Complete hierarchy, accurate numbering, few-shots verified, edge cases covered |
| ⚠️ Review Needed | Deployable after modifications | Structure is correct but verification failures or omissions exist |
| 🔧 Work Needed | Known gaps exist | Missing country-specific standards or core field errors |

1. KR.yaml — South Korea

1.1 Hierarchy Completeness: ★★★★★

Level Included Actual Law Notes
Volume (編) Described in document_types Large laws like Civil Code, Criminal Code Not in levels array — appears to be by design
Chapter (章) Described in document_types Most laws Same as above
Section (節) Described in document_types Internal classification in laws Same as above
Article (條) Basic unit of all laws
Paragraph (項) Subdivision of articles
Item (號) Enumeration within paragraphs
Subitem (目) Sub-classification of items

Judgment: The 4-level hierarchy from Article to Subitem is the core, and Volume/Chapter/Section are appropriately described as optional upper classifications in document_types. No omissions.

1.2 Numbering Accuracy: ★★★★★

| Level | Profile Description | Actual Law Match |
|---|---|---|
| Article | Arabic numeral + Article (제1조) | 제1조, 제420조 |
| Paragraph | Circled Arabic numerals ①②③ | ①, ②, ③ |
| Item | Arabic numeral + period (1., 2.) | 1., 2., 3. |
| Subitem | Korean consonant + period (가., 나.) | 가., 나., 다. |
| Inserted Article | 제N조의M | 제6조의2, 제34조의2 |
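
The article and inserted-article forms listed above can be illustrated with a small matcher. This is a minimal sketch, assuming a regex-based extractor; `ARTICLE_RE` and `find_articles` are hypothetical names and are not taken from KR.yaml.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical pattern: "제N조" with an optional inserted-article suffix "의M".
ARTICLE_RE = re.compile(r"제(?P<article>\d+)조(?:의(?P<branch>\d+))?")


def find_articles(text: str) -> List[Tuple[int, Optional[int]]]:
    """Return (article, branch) pairs; branch is None for ordinary articles."""
    pairs = []
    for match in ARTICLE_RE.finditer(text):
        branch = match.group("branch")
        pairs.append((int(match.group("article")), int(branch) if branch else None))
    return pairs


print(find_articles("제6조의2 및 제420조"))  # [(6, 2), (420, None)]
```

Keeping the branch number separate from the article number means 제6조의2 is not confused with 제6조 followed by an unrelated "2".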

Special Note: is_unnumbered_when_single: true. When an article contains only one paragraph, the ① marker is omitted. This is a core rule of Korean law, and the profile describes it accurately.
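
The single-paragraph rule can be sketched as follows. Only the flag name `is_unnumbered_when_single` comes from the profile; the function `paragraph_labels` and its 20-paragraph limit (the range of the circled digits ① to ⑳) are assumptions made for illustration.

```python
from typing import List, Optional


def paragraph_labels(paragraph_count: int,
                     is_unnumbered_when_single: bool = True) -> List[Optional[str]]:
    """Return the expected marker for each paragraph of one article.

    None means the paragraph carries no marker in the legal text.
    """
    if not 1 <= paragraph_count <= 20:
        raise ValueError("this sketch only covers the circled digits ① to ⑳")
    if paragraph_count == 1 and is_unnumbered_when_single:
        return [None]
    # U+2460 is ①; the following code points continue the sequence.
    return [chr(0x2460 + i) for i in range(paragraph_count)]


print(paragraph_labels(1))  # [None]
print(paragraph_labels(3))  # ['①', '②', '③']
```

An extractor that ignores this rule would look for ① in single-paragraph articles such as 제53조 and find nothing.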

1.3 Few-shot Realism: ★★★★☆

| # | Type | reference_id Verification | Issues |
|---|---|---|---|
| 1 | Multi-extraction (4-level depth) | 제2조✅, ①✅, 1.✅, 가.✅ | None |
| 2 | Single paragraph (number omission) | 제53조✅ | None |
| 3 | Inserted article | 제34조의2✅, ①✅ | None |
| 4 | Empty result | | None |
| 5 | Presidential decree | 제5조✅, ②✅ | None |

Issue: reference_id values such as "①", "1.", and "가." are very short tokens. These strings appear very frequently in actual legal text, which can hurt the extractor's precision. More specific combinations such as "제2조①" or "제2조 제1항" would improve extraction accuracy.
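
One way to build the longer identifiers is to compose them from the full hierarchy path. The sketch below is an assumption about how that could look, not the profile's current behaviour; `qualified_reference_id` is a hypothetical helper, and it renders the path in the "제N항 / 제N호 / 가목" citation style.

```python
from typing import Optional


def qualified_reference_id(article: str,
                           paragraph: Optional[str] = None,
                           item: Optional[str] = None,
                           subitem: Optional[str] = None) -> str:
    """Join the raw markers of one path into a single, more specific id."""
    parts = [article]
    if paragraph:
        # ① is U+2460, so subtracting 0x245F maps ① to 1, ② to 2, and so on.
        parts.append(f"제{ord(paragraph) - 0x245F}항")
    if item:
        parts.append(f"제{item.rstrip('.')}호")
    if subitem:
        parts.append(f"{subitem.rstrip('.')}목")
    return " ".join(parts)


print(qualified_reference_id("제2조", "①", "1.", "가."))  # 제2조 제1항 제1호 가목
```

The qualified form is far less likely to collide with unrelated occurrences of "①" or "1." elsewhere in the same document.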

Realism: The few-shot text matches actual legal language. Article 2 of the Unfair Competition Prevention Act, Article 53 of the Constitution, and Article 34-2 of the Personal Information Protection Act are all actual provisions.

1.4 Uncovered Edge Cases

| Edge | Current Status | Impact |
|---|---|---|
| Amending laws (old vs. new provisions) | Uncovered | Possible extraction failure when a text cites both old and new provisions during a law amendment |
| Cross-references between laws | Described in cross_references | Partial coverage: connectors such as "준용" (application mutatis mutandis) and "에 의하면" ("according to") exist, but complex reference patterns are lacking |
| Code vs Act difference | document_types distinguishes laws/presidential decrees/ministerial orders/subordinate regulations | Sufficient |
| Multilingual documents | Uncovered | Handling is unclear when a foreign language appears side by side with Korean in a law |
| Case law citations | Uncovered | "대법원 2023다12345 판결" (Supreme Court decision) format not in levels |
| Treaty citations | Uncovered | "한일청구권협정 제3조" (Korea-Japan Claims Agreement, Article 3) format |
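
To make the case law gap concrete, the following sketch parses only the exact format quoted in the table. `CASE_RE` and `parse_case_citation` are hypothetical, and the pattern assumes the court name, a four-digit year, a Hangul case-type code, a serial number, and the word 판결, in that order. Variants with a decision date, other courts, or treaty citations are not handled.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern for citations shaped like "대법원 2023다12345 판결".
CASE_RE = re.compile(
    r"(?P<court>대법원)\s+(?P<year>\d{4})(?P<case_type>[가-힣]+)(?P<serial>\d+)\s+판결"
)


def parse_case_citation(text: str) -> Optional[Dict[str, str]]:
    """Return the citation fields, or None when no citation is found."""
    match = CASE_RE.search(text)
    return match.groupdict() if match else None


print(parse_case_citation("대법원 2023다12345 판결"))
```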

1.5 Citation Mode: ✅ Accurate

Prefix mode: "「부정경쟁방지 및 영업비밀보호에 관한 법률」 제5조 제2항 제3호" format (Article 5, Paragraph 2, Item 3 of the named act). This matches the official standard for Korean legal citations.
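
A citation in this format can be split into its parts with one pattern. This is a minimal sketch under the assumption that the law name is always enclosed in 「」 and that the path uses the 조/항/호 suffixes; `CITATION_RE` and `parse_prefix_citation` are hypothetical names.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern: 「law name」 followed by article, optional paragraph and item.
CITATION_RE = re.compile(
    r"「(?P<law>[^」]+)」\s*"
    r"제(?P<article>\d+)조(?:의(?P<branch>\d+))?"
    r"(?:\s*제(?P<paragraph>\d+)항)?"
    r"(?:\s*제(?P<item>\d+)호)?"
)


def parse_prefix_citation(text: str) -> Optional[Dict[str, object]]:
    """Return the law name plus numeric path fields (None when absent)."""
    match = CITATION_RE.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    result: Dict[str, object] = {"law": fields.pop("law")}
    for name, value in fields.items():
        result[name] = int(value) if value is not None else None
    return result


print(parse_prefix_citation("「부정경쟁방지 및 영업비밀보호에 관한 법률」 제5조 제2항 제3호"))
```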

1.6 Source Reliability: ★★★★★

| Source | Grade | Notes |
|---|---|---|
| law.go.kr | Official 1st | Official database for all Korean laws |
| lawmaking.go.kr | Official 1st | Legal document drafting rules |
| scourt.go.kr | Official 1st | Case citation format |
| lac.or.kr | Official 2nd | Legal consultation materials |
| klri.re.kr | Academic | Legal policy research |

Three Official 1st sources are secured. This level of source quality is rarely seen in the other country profiles.

Final Grade: ✅ Deployable

Rationale: Complete hierarchy, accurate numbering, 3 official sources, few-shot verification passed. The reference_id length issue in few-shots is a performance tuning area, not a functional defect.


2. US.yaml — United States

2.1 Hierarchy Completeness: ★★★★☆

Level Included Actual Law Notes
Title document_types USC Title Not in levels array
Section Primary unit § symbol
Subsection (a), (b), (c)
Paragraph (1), (2), (3)
Subparagraph (A), (B), (C)
Clause (i), (ii), (iii)

Judgment: 5-level hierarchy from Section to Clause is complete. Title is appropriately in document_types.

2.2 Numbering Accuracy: ★★★★☆

| Level | Profile Description | Actual Law Match |
|---|---|---|
| Section | § N | § 78j, § 405 |
| Subsection | (a), (b), (c) | (a), (b), (c) |
| Paragraph | (1), (2), (3) | (1), (2), (3) |
| Subparagraph | (A), (B), (C) | (A), (B), (C) |
| Clause | (i), (ii), (iii) | (i), (ii), (iii) |

Issue: The Roman numeral pattern for Clause has not been verified against multi-character edge cases such as (xix) and (xx).
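
A check of this kind could look like the sketch below. It is not the profile's pattern: `clause_number` and its helpers are hypothetical, and the check accepts a label only if converting it to an integer and back reproduces the same string, which rejects malformed numerals such as (iiii). Note that a bare (i) can also be a Subsection letter, so context is still needed to separate the two.

```python
import re
from typing import Optional

ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50}
CLAUSE_RE = re.compile(r"\(([ivxl]+)\)")


def roman_to_int(numeral: str) -> int:
    """Convert a lowercase Roman numeral using the subtractive rule."""
    total = 0
    for current, following in zip(numeral, numeral[1:] + " "):
        value = ROMAN_VALUES[current]
        total += -value if ROMAN_VALUES.get(following, 0) > value else value
    return total


def int_to_roman(number: int) -> str:
    """Render 1..89 as a canonical lowercase Roman numeral."""
    pieces = []
    for value, numeral in [(50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
                           (5, "v"), (4, "iv"), (1, "i")]:
        while number >= value:
            pieces.append(numeral)
            number -= value
    return "".join(pieces)


def clause_number(label: str) -> Optional[int]:
    """Return the clause ordinal for labels like "(xix)", else None."""
    match = CLAUSE_RE.fullmatch(label)
    if match is None:
        return None
    number = roman_to_int(match.group(1))
    return number if int_to_roman(number) == match.group(1) else None


print(clause_number("(xix)"), clause_number("(xx)"), clause_number("(iiii)"))  # 19 20 None
```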

2.3 Few-shot Realism: ★★★☆☆

| # | Type | reference_id Verification | Issues |
|---|---|---|---|
| 1 | Multi-level | § 78j✅, (b)✅, (1)✅ | None |
| 2 | Simple section | § 405✅ | None |
| 3 | Deep nesting | § 78j(b)(1)(A)(i)✅ | None |

Issue: The few-shots cover only clean cases. Real legal text often contains ambiguous numbering (e.g., a bare "2" could be a paragraph or a point).

2.4 Uncovered Edge Cases

| Edge | Current Status | Impact |
|---|---|---|
| CFR references | Partially covered | Part.section format (e.g., 303.1) needs verification |
| Public Law citations | Uncovered | Pub. L. No. 117-58 format |
| Statutes at Large | Uncovered | 135 Stat. 429 format |
| Legislative history | Uncovered | H.R. Rep. No. 117-70 format |
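
The three uncovered formats in the table are regular enough to detect with simple patterns. The sketch below is illustrative only; `PATTERNS` and `find_secondary_citations` are hypothetical, and the patterns are written against the exact spellings shown above, so variants such as a missing "No." or an en dash would need extra work.

```python
import re
from typing import List, Tuple

# Hypothetical patterns keyed by citation kind.
PATTERNS = {
    "public_law": re.compile(r"Pub\.\s*L\.\s*No\.\s*\d+-\d+"),
    "statutes_at_large": re.compile(r"\d+\s+Stat\.\s+\d+"),
    "house_report": re.compile(r"H\.R\.\s*Rep\.\s*No\.\s*\d+-\d+"),
}


def find_secondary_citations(text: str) -> List[Tuple[str, str]]:
    """Return (kind, citation) pairs in the order they appear in the text."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), kind, match.group(0)))
    return [(kind, citation) for _, kind, citation in sorted(hits)]


sample = "Pub. L. No. 117-58; 135 Stat. 429; H.R. Rep. No. 117-70"
for kind, citation in find_secondary_citations(sample):
    print(kind, "->", citation)
```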

Final Grade: ⚠️ Review Needed

Rationale: Core hierarchy is complete, but edge cases need verification. Few-shots are too clean for adversarial testing.


3. DE.yaml — Germany

3.1 Hierarchy Completeness: ★★★★☆

Level Included Actual Law Notes
Paragraph (§) Primary unit § symbol
Absatz (1), (2), (3)
Satz Sentence within Absatz
Nummer 1., 2., 3.
Buchstabe a), b), c)

Judgment: 5-level hierarchy is complete for German law structure.

3.2 Numbering Accuracy: ★★★★☆

| Level | Profile Description | Actual Law Match |
|---|---|---|
| Paragraph | § N | § 242 BGB, § 1 StGB |
| Absatz | (1), (2), (3) | (1), (2), (3) |
| Satz | S. 1, S. 2 | S. 1, S. 2 |
| Nummer | 1., 2., 3. | 1., 2., 3. |
| Buchstabe | a), b), c) | a), b), c) |

3.3 Few-shot Realism: ★★★★☆

| # | Type | reference_id Verification | Issues |
|---|---|---|---|
| 1 | Multi-level | § 242✅, (1)✅, S. 1✅ | None |
| 2 | Simple paragraph | § 1✅ | None |
| 3 | Deep nesting | § 242(1)S. 1✅ | None |

3.4 Uncovered Edge Cases

| Edge | Current Status | Impact |
|---|---|---|
| Artikel (constitutional law) | Uncovered | Art. 1 GG format |
| Nebengesetze (subsidiary laws) | Partially covered | Needs verification |
| EU law references | Uncovered | Art. 5 AEUV format |
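
The Artikel gap could be closed by treating "Art." as a second primary-unit marker next to "§". The following is a sketch under that assumption, not the DE.yaml definition; `CITATION_RE` and `find_citations` are hypothetical, and the optional "Abs." and "S." parts assume the abbreviations commonly used in running text.

```python
import re
from typing import List, Tuple

# Hypothetical pattern: marker, number, optional Absatz and Satz, then the code abbreviation.
CITATION_RE = re.compile(
    r"(?P<marker>§|Art\.)\s*(?P<number>\d+[a-z]?)"
    r"(?:\s+Abs\.\s*(?P<absatz>\d+))?"
    r"(?:\s+S\.\s*(?P<satz>\d+))?"
    r"\s+(?P<code>[A-Z][A-Za-z]*)"
)


def find_citations(text: str) -> List[Tuple[str, str, str]]:
    """Return (marker, number, code) for each citation found."""
    return [
        (m.group("marker"), m.group("number"), m.group("code"))
        for m in CITATION_RE.finditer(text)
    ]


print(find_citations("§ 242 BGB, § 1 StGB, Art. 1 GG, Art. 5 AEUV"))
```

Because the code abbreviation is matched loosely as any capitalised word, a production pattern would likely need a list of known abbreviations (BGB, StGB, GG, AEUV, and so on) to avoid false positives.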

Final Grade: ⚠️ Review Needed

Rationale: Core structure is solid, but constitutional and EU law references need coverage.


4. TH.yaml — Thailand

4.1 Hierarchy Completeness: ★★★☆☆

Level Included Actual Law Notes
มาตรา (Section) Primary unit Thai numerals
วรรค (Paragraph) First paragraph unnumbered
อนุมาตรา (Subsection) (1), (2), (3)
ข้อ (Item) Thai numerals

Judgment: A 4-level hierarchy exists, but its Thai numeral handling needs verification.

4.2 Numbering Accuracy: ★★★☆☆

| Level | Profile Description | Actual Law Match |
|---|---|---|
| มาตรา | Thai numerals (๐-๙) | มาตรา ๔๒๐ ⚠️ Needs verification |
| วรรค | First unnumbered | วรรคหนึ่ง |
| อนุมาตรา | (1), (2), (3) | (๑), (๒), (๓) ⚠️ Mixed numerals |

Issue: The patterns need explicit handling of Thai numerals (๐-๙), including text that mixes them with Western digits.
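
One possible approach is to normalize Thai digits to ASCII before matching, so that a single pattern covers both numeral systems. The sketch below assumes that approach; `THAI_DIGITS`, `normalize_thai_digits`, and `section_numbers` are hypothetical names.

```python
import re
from typing import List

# Thai digits occupy U+0E50 (๐) through U+0E59 (๙); map each to its ASCII digit.
THAI_DIGITS = {0x0E50 + d: str(d) for d in range(10)}
SECTION_RE = re.compile(r"มาตรา\s*([0-9]+)")


def normalize_thai_digits(text: str) -> str:
    """Replace Thai digits with ASCII digits and leave everything else as-is."""
    return text.translate(THAI_DIGITS)


def section_numbers(text: str) -> List[int]:
    """Return the section numbers cited after the word มาตรา."""
    return [int(n) for n in SECTION_RE.findall(normalize_thai_digits(text))]


print(section_numbers("มาตรา ๔๒๐ และมาตรา 421"))  # [420, 421]
print(normalize_thai_digits("(๑), (๒), (๓)"))      # (1), (2), (3)
```

Normalizing first also resolves the mixed case in the table, where the profile describes (1), (2), (3) but the law text uses (๑), (๒), (๓).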

4.3 Few-shot Realism: ★★☆☆☆

Issue: The few-shots are limited. They do not fully cover the complexity of Thai legal text (multiple numeral systems, formal/informal variants).

Final Grade: 🔧 Work Needed

Rationale: Core structure exists but Thai numeral handling and few-shot coverage need significant work.


5. SA.yaml — Saudi Arabia

5.1 Hierarchy Completeness: ★★☆☆☆

Level Included Actual Law Notes
مادة (Article) Primary unit Arabic numerals
فقرة (Paragraph) (١), (٢), (٣) Eastern Arabic
بند (Item) (أ), (ب), (ج) Arabic letters

Judgment: 3-level hierarchy is minimal for Saudi law structure.

5.2 Numbering Accuracy: ★★☆☆☆

| Level | Profile Description | Actual Law Match |
|---|---|---|
| مادة | Arabic numerals | مادة ١ ⚠️ Eastern Arabic needs handling |
| فقرة | Eastern Arabic (٠-٩) | (١), (٢) ⚠️ Mixed with Western Arabic |
| بند | Arabic letters | (أ), (ب) |

Issue: Eastern Arabic numeral handling is critical, and the patterns need to support it explicitly.
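
A generic way to handle this is to map every Unicode decimal digit to ASCII before pattern matching, which covers Eastern Arabic digits and Western digits in the same text. This is a minimal sketch under that assumption; `normalize_digits`, `article_numbers`, and `ARTICLE_WORD` are hypothetical names, and right-to-left display concerns are out of scope here.

```python
import re
import unicodedata
from typing import List

ARTICLE_WORD = "\u0645\u0627\u062f\u0629"  # the word مادة (Article)
ARTICLE_RE = re.compile(ARTICLE_WORD + r"\s*([0-9]+)")


def normalize_digits(text: str) -> str:
    """Replace any Unicode decimal digit (e.g. ١ or ۳) with its ASCII digit."""
    return "".join(
        str(unicodedata.decimal(ch)) if ch.isdecimal() else ch for ch in text
    )


def article_numbers(text: str) -> List[int]:
    """Return the article numbers cited after the word مادة."""
    return [int(n) for n in ARTICLE_RE.findall(normalize_digits(text))]


print(normalize_digits("(\u0661), (\u0662)"))               # (1), (2)
print(article_numbers(ARTICLE_WORD + " \u0661\u0662"))      # [12]
```

The same function would also normalize Thai digits, so a single preprocessing step could serve both the TH and SA profiles.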

5.3 Few-shot Realism: ★☆☆☆☆

Issue: The few-shots are very limited. They do not adequately cover the complexity of Arabic legal text (right-to-left script, multiple numeral systems).

Final Grade: 🔧 Work Needed

Rationale: Core structure exists, but Eastern Arabic numeral handling, RTL support, and few-shot coverage need significant work.


Cross-Country Comparison

| Country | Hierarchy | Numbering | Few-shots | Edge Cases | Overall |
|---|---|---|---|---|---|
| KR | ★★★★★ | ★★★★★ | ★★★★☆ | ★★★☆☆ | ✅ Deployable |
| US | ★★★★☆ | ★★★★☆ | ★★★☆☆ | ★★★☆☆ | ⚠️ Review Needed |
| DE | ★★★★☆ | ★★★★☆ | ★★★★☆ | ★★★☆☆ | ⚠️ Review Needed |
| TH | ★★★☆☆ | ★★★☆☆ | ★★☆☆☆ | ★★☆☆☆ | 🔧 Work Needed |
| SA | ★★☆☆☆ | ★★☆☆☆ | ★☆☆☆☆ | ★☆☆☆☆ | 🔧 Work Needed |
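
For tracking these ratings over time, the star strings can be converted to numbers. The sketch below simply counts filled stars per country; the summed total is an illustrative aggregate and is not the method used to assign the Overall grades above. `RATINGS`, `stars`, and `totals` are hypothetical names.

```python
from typing import Dict, List

# Ratings copied from the comparison table: hierarchy, numbering, few-shots, edge cases.
RATINGS: Dict[str, List[str]] = {
    "KR": ["★★★★★", "★★★★★", "★★★★☆", "★★★☆☆"],
    "US": ["★★★★☆", "★★★★☆", "★★★☆☆", "★★★☆☆"],
    "DE": ["★★★★☆", "★★★★☆", "★★★★☆", "★★★☆☆"],
    "TH": ["★★★☆☆", "★★★☆☆", "★★☆☆☆", "★★☆☆☆"],
    "SA": ["★★☆☆☆", "★★☆☆☆", "★☆☆☆☆", "★☆☆☆☆"],
}


def stars(rating: str) -> int:
    """Count the filled stars in a rating string."""
    return rating.count("★")


def totals(ratings: Dict[str, List[str]]) -> Dict[str, int]:
    """Sum the filled stars across all dimensions for each country."""
    return {country: sum(stars(r) for r in row) for country, row in ratings.items()}


print(totals(RATINGS))  # {'KR': 17, 'US': 14, 'DE': 15, 'TH': 10, 'SA': 6}
```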

Recommendations

Immediate (Before Production)

  1. KR: Monitor reference_id precision in production
  2. US: Add CFR edge cases to few-shots
  3. DE: Add constitutional law references

Short-term (1-2 months)

  1. TH: Implement Thai numeral pattern support
  2. SA: Implement Eastern Arabic numeral handling
  3. All: Add adversarial test cases

Long-term (3-6 months)

  1. All: Expand edge case coverage
  2. All: Add multilingual document support
  3. All: Implement case law and treaty citations

Deployment Decision

KR is ready for production deployment. The few-shot reference_id length issue is a performance tuning concern, not a functional defect. Monitor precision metrics in production.

US and DE can be deployed after adding edge case few-shots. Core structure is solid.

TH and SA require significant additional work before production deployment, particularly for numeral system handling.

