CountryProfile YAML System — Production Deployment Reliability Assessment
Assessment Date: 2026-06-25
Assessment Target: Five country profiles (KR, US, DE, TH, SA)
Assessment Criteria: Can it be used as a citation extraction system in actual law firms?
Audience: Developers (deployment decision makers)
One-Line Summary
Only KR is ready for production deployment. US and DE can be deployed after modifications. TH and SA require additional work.
Assessment Criteria Explanation
| Grade | Meaning | Criteria |
|---|---|---|
| ✅ Deployable | Can be used in production as-is | Complete hierarchy, accurate numbering, few-shots verified, edge cases covered |
| ⚠️ Review Needed | Deployable after modifications | Structure is correct but verification failures or omissions exist |
| 🔧 Work Needed | Known gaps exist | Missing country-specific standards or core field errors |
1. KR.yaml — South Korea
1.1 Hierarchy Completeness: ★★★★★
| Level | Included | Actual Law | Notes |
|---|---|---|---|
| Volume (編) | Described in document_types | Large laws like Civil Code, Criminal Code | Not in levels array — appears to be by design |
| Chapter (章) | Described in document_types | Most laws | Same as above |
| Section (節) | Described in document_types | Internal classification in laws | Same as above |
| Article (條) | ✅ | Basic unit of all laws | |
| Paragraph (項) | ✅ | Subdivision of articles | |
| Item (號) | ✅ | Enumeration within paragraphs | |
| Subitem (目) | ✅ | Sub-classification of items | |
Judgment: The 4-level hierarchy from Article to Subitem is the core, and Volume/Chapter/Section are appropriately described as optional upper classifications in document_types. No omissions.
1.2 Numbering Accuracy: ★★★★★
| Level | Profile Description | Actual Law | Match |
|---|---|---|---|
| Article | Arabic numeral + Article (제1조) | 제1조, 제420조 | ✅ |
| Paragraph | Circled Arabic numerals ①②③ | ①, ②, ③ | ✅ |
| Item | Arabic numeral + period (1., 2.) | 1., 2., 3. | ✅ |
| Subitem | Korean consonant + period (가., 나.) | 가., 나., 다. | ✅ |
| Inserted Article | 제N조의M | 제6조의2, 제34조의2 | ✅ |
Special Note: is_unnumbered_when_single: true. When an article has only a single paragraph, the ① marker is omitted. This is a core rule of Korean law, and the profile describes it accurately.
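The numbering rules in this section can be sketched as a few lines of Python. This is only an illustration under stated assumptions: the regex and the helper names `find_articles` and `paragraph_markers` are hypothetical and are not taken from KR.yaml. Only the `is_unnumbered_when_single` flag name comes from the profile.

```python
import re
from typing import List, Optional, Tuple

# Assumed pattern, not taken from KR.yaml: "제N조" with an optional "의M" suffix.
ARTICLE_RE = re.compile(r"제(\d+)조(?:의(\d+))?")


def find_articles(text: str) -> List[Tuple[int, Optional[int]]]:
    """Return (article, insertion) pairs; insertion is None for plain articles."""
    return [
        (int(m.group(1)), int(m.group(2)) if m.group(2) else None)
        for m in ARTICLE_RE.finditer(text)
    ]


def paragraph_markers(count: int, is_unnumbered_when_single: bool = True) -> List[str]:
    """Return the circled markers an article with `count` paragraphs would carry."""
    if not 1 <= count <= 20:
        raise ValueError("circled numerals ① to ⑳ cover 1 to 20 only")
    if count == 1 and is_unnumbered_when_single:
        return []  # a lone paragraph carries no ① marker
    # ① is U+2460, and the circled numerals up to ⑳ are contiguous.
    return [chr(0x2460 + i) for i in range(count)]


print(find_articles("제6조의2 및 제420조"))  # [(6, 2), (420, None)]
print(paragraph_markers(3))                  # ['①', '②', '③']
print(paragraph_markers(1))                  # []
```

An extractor that expects ① on every article would miss single-paragraph articles such as 제53조, which is why the flag matters.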
1.3 Few-shot Realism: ★★★★☆
| # | Type | reference_id Verification | Issues |
|---|---|---|---|
| 1 | Multi-extraction (4-level depth) | 제2조✅, ①✅, 1.✅, 가.✅ | None |
| 2 | Single paragraph (number omission) | 제53조✅ | None |
| 3 | Inserted article | 제34조의2✅, ①✅ | None |
| 4 | Empty result | — | None |
| 5 | Presidential decree | 제5조✅, ②✅ | None |
Issue: reference_id values such as "①", "1.", and "가." are too short to be distinctive. These strings appear very frequently in actual legal text, so matching on them alone can lower the extractor's precision. Providing reference_id as a more specific combination, such as "제2조①" or "제2조 제1항", would improve extraction accuracy.
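One way to produce the more specific combinations suggested above is to prefix each marker with its ancestors. The sketch below is a hypothetical helper, not part of the profile system. The function name `qualify_reference_ids` and the `sep` parameter are assumptions for illustration.

```python
from typing import List


def qualify_reference_ids(path: List[str], sep: str = "") -> List[str]:
    """Prefix each level's marker with the markers of its ancestors.

    `path` lists markers from the article downward, e.g. ["제2조", "①", "1."].
    """
    qualified = []
    prefix = ""
    for marker in path:
        # Join with the separator only once there is something to join to.
        prefix = marker if not prefix else f"{prefix}{sep}{marker}"
        qualified.append(prefix)
    return qualified


print(qualify_reference_ids(["제2조", "①", "1.", "가."]))
# ['제2조', '제2조①', '제2조①1.', '제2조①1.가.']
```

Whether the qualified strings should follow the compact form ("제2조①") or the spelled-out form ("제2조 제1항") depends on how the source text is written, so both would need checking against real documents.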
Realism: Few-shot text matches actual legal language. Article 2 of the Unfair Competition Prevention Act, Article 53 of the Constitution, Article 34-2 of the Personal Information Protection Act are all actual provisions.
1.4 Uncovered Edge Cases
| Edge | Current Status | Impact |
|---|---|---|
| Amending laws (old vs. new provisions) | Uncovered | Possible extraction failure when citing both old and new provisions during law amendments |
| Cross-references between laws | Described in cross_references | Partial coverage: connectors such as "준용" and "에 의하면" exist, but complex reference patterns are lacking |
| Code vs Act difference | document_types distinguishes laws/presidential decrees/ministerial orders/subordinate regulations | Sufficient |
| Multilingual documents | Uncovered | Handling is unclear when foreign-language text appears side by side with Korean text in a law |
| Case law citations | Uncovered | "대법원 2023다12345 판결" format not in levels |
| Treaty citations | Uncovered | "한일청구권협정 제3조" format |
1.5 Citation Mode: ✅ Accurate
Prefix mode — "「부정경쟁방지 및 영업비밀보호에 관한 법률」 제5조 제2항 제3호" format. Matches the official standard for Korean legal citations.
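As a rough illustration of prefix mode, the following sketch parses a citation in which the law name in 「」 precedes the article, paragraph, and item. The regex and the name `parse_citation` are assumptions for illustration and do not reflect how the profile or extractor is implemented.

```python
import re
from typing import Dict, Optional

# Assumed shape: 「law name」 제N조[의M] [제N항] [제N호]
CITATION_RE = re.compile(
    r"「(?P<law>[^」]+)」\s*"
    r"제(?P<article>\d+)조(?:의(?P<insertion>\d+))?"
    r"(?:\s*제(?P<paragraph>\d+)항)?"
    r"(?:\s*제(?P<item>\d+)호)?"
)


def parse_citation(text: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the named parts of the first prefix-mode citation, or None."""
    match = CITATION_RE.search(text)
    return match.groupdict() if match else None


parsed = parse_citation("「부정경쟁방지 및 영업비밀보호에 관한 법률」 제5조 제2항 제3호")
print(parsed["article"], parsed["paragraph"], parsed["item"])  # 5 2 3
```

Levels that are absent from the citation come back as None, so a caller can tell "제5조" apart from "제5조 제2항".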
1.6 Source Reliability: ★★★★★
| Source | Grade | Notes |
|---|---|---|
| law.go.kr | Official (primary) | Official database for all Korean laws |
| lawmaking.go.kr | Official (primary) | Legal document drafting rules |
| scourt.go.kr | Official (primary) | Case citation format |
| lac.or.kr | Official (secondary) | Legal consultation materials |
| klri.re.kr | Academic | Legal policy research |
Three official primary sources are secured. This level of source quality is rarely seen in the other country profiles.
Final Grade: ✅ Deployable
Rationale: Complete hierarchy, accurate numbering, 3 official sources, few-shot verification passed. The reference_id length issue in few-shots is a performance tuning area, not a functional defect.
2. US.yaml — United States
2.1 Hierarchy Completeness: ★★★★☆
| Level | Included | Actual Law | Notes |
|---|---|---|---|
| Title | document_types | USC Title | Not in levels array |
| Section | ✅ | Primary unit | § symbol |
| Subsection | ✅ | (a), (b), (c) | |
| Paragraph | ✅ | (1), (2), (3) | |
| Subparagraph | ✅ | (A), (B), (C) | |
| Clause | ✅ | (i), (ii), (iii) | |
Judgment: 5-level hierarchy from Section to Clause is complete. Title is appropriately in document_types.
2.2 Numbering Accuracy: ★★★★☆
| Level | Profile Description | Actual Law | Match |
|---|---|---|---|
| Section | § N | § 78j, § 405 | ✅ |
| Subsection | (a), (b), (c) | (a), (b), (c) | ✅ |
| Paragraph | (1), (2), (3) | (1), (2), (3) | ✅ |
| Subparagraph | (A), (B), (C) | (A), (B), (C) | ✅ |
| Clause | (i), (ii), (iii) | (i), (ii), (iii) | ✅ |
Issue: Roman numeral pattern needs verification for edge cases (e.g., (xix), (xx)).
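The Roman numeral check could be exercised with a strict validator such as the sketch below. It is an illustration only: `clause_number` is a hypothetical helper, and the regex is a standard strict Roman numeral pattern rather than whatever US.yaml currently uses.

```python
import re
from typing import Optional

# Strict lowercase Roman numerals (1 to 3999); the lookahead rejects the empty string.
ROMAN_RE = re.compile(
    r"^(?=[ivxlcdm])m{0,3}(cm|cd|d?c{0,3})(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})$"
)
VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def clause_number(marker: str) -> Optional[int]:
    """Return the value of a '(xix)'-style marker, or None if it is not one."""
    if not (marker.startswith("(") and marker.endswith(")")):
        return None
    body = marker[1:-1]
    if not ROMAN_RE.match(body):
        return None
    total = 0
    # A symbol smaller than its right-hand neighbour is subtracted (iv, ix, ...).
    for ch, nxt in zip(body, body[1:] + " "):
        value = VALUES[ch]
        total += -value if VALUES.get(nxt, 0) > value else value
    return total


print(clause_number("(xix)"))   # 19
print(clause_number("(xx)"))    # 20
print(clause_number("(iiii)"))  # None
```

Note that single letters such as "(i)", "(v)", "(x)", and "(c)" are valid under this pattern and are also valid Subsection markers, so the level cannot be decided from the marker alone. Position in the nesting order has to break the tie.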
2.3 Few-shot Realism: ★★★☆☆
| # | Type | reference_id Verification | Issues |
|---|---|---|---|
| 1 | Multi-level | § 78j✅, (b)✅, (1)✅ | None |
| 2 | Simple section | § 405✅ | None |
| 3 | Deep nesting | § 78j(b)(1)(A)(i)✅ | None |
Issue: The few-shots cover only clean cases. Real legal text often has ambiguous numbering (e.g., a bare "2" could be a paragraph or a point).
2.4 Uncovered Edge Cases
| Edge | Current Status | Impact |
|---|---|---|
| CFR references | Partially covered | Part.section format (e.g., 303.1) needs verification |
| Public Law citations | Uncovered | Pub. L. No. 117-58 format |
| Statutes at Large | Uncovered | 135 Stat. 429 format |
| Legislative history | Uncovered | H.R. Rep. No. 117-70 format |
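The formats in the table above could be probed with simple patterns before deciding how to model them in the profile. The sketch below is a starting point under assumptions: the pattern names and `find_citations` are hypothetical, and the "12 C.F.R. § 303.1" form is an assumed rendering of the Part.section format, since the assessment only gives "303.1".

```python
import re
from typing import Dict, List, Tuple

# Assumed patterns built from the example formats in the edge-case table.
PATTERNS = {
    "public_law": re.compile(r"Pub\. L\. No\. (\d+)-(\d+)"),
    "statutes_at_large": re.compile(r"(\d+) Stat\. (\d+)"),
    "house_report": re.compile(r"H\.R\. Rep\. No\. (\d+)-(\d+)"),
    "cfr": re.compile(r"(\d+) C\.F\.R\. § (\d+)\.(\d+)"),
}


def find_citations(text: str) -> Dict[str, List[Tuple[str, ...]]]:
    """Return the captured numbers for each citation kind found in `text`."""
    found = {}
    for kind, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[kind] = matches
    return found


print(find_citations("Pub. L. No. 117-58, 135 Stat. 429"))
# {'public_law': [('117', '58')], 'statutes_at_large': [('135', '429')]}
```

Real documents vary in spacing and abbreviation, so these patterns would likely need loosening before they could serve as few-shot generators or test oracles.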
Final Grade: ⚠️ Review Needed
Rationale: Core hierarchy is complete, but edge cases need verification. Few-shots are too clean for adversarial testing.
3. DE.yaml — Germany
3.1 Hierarchy Completeness: ★★★★☆
| Level | Included | Actual Law | Notes |
|---|---|---|---|
| Paragraph (§) | ✅ | Primary unit | § symbol |
| Absatz | ✅ | (1), (2), (3) | |
| Satz | ✅ | Sentence within Absatz | |
| Nummer | ✅ | 1., 2., 3. | |
| Buchstabe | ✅ | a), b), c) | |
Judgment: 5-level hierarchy is complete for German law structure.
3.2 Numbering Accuracy: ★★★★☆
| Level | Profile Description | Actual Law | Match |
|---|---|---|---|
| Paragraph | § N | § 242 BGB, § 1 StGB | ✅ |
| Absatz | (1), (2), (3) | (1), (2), (3) | ✅ |
| Satz | S. 1, S. 2 | S. 1, S. 2 | ✅ |
| Nummer | 1., 2., 3. | 1., 2., 3. | ✅ |
| Buchstabe | a), b), c) | a), b), c) | ✅ |
3.3 Few-shot Realism: ★★★★☆
| # | Type | reference_id Verification | Issues |
|---|---|---|---|
| 1 | Multi-level | § 242✅, (1)✅, S. 1✅ | None |
| 2 | Simple paragraph | § 1✅ | None |
| 3 | Deep nesting | § 242(1)S. 1✅ | None |
3.4 Uncovered Edge Cases
| Edge | Current Status | Impact |
|---|---|---|
| Artikel (constitutional law) | Uncovered | Art. 1 GG format |
| Nebengesetze (subsidiary laws) | Partially covered | Needs verification |
| EU law references | Uncovered | Art. 5 AEUV format |
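The two uncovered "Art." formats share a shape, which the sketch below tries to capture. It is a hypothetical pattern for illustration: `ARTIKEL_RE`, `find_artikel`, and the restriction to the GG and AEUV abbreviations are assumptions, and the optional "Abs." part is included only as an assumed extension.

```python
import re
from typing import List, Optional, Tuple

# Assumed shape: "Art. N [Abs. M] CODE", limited to the two codes named in the table.
ARTIKEL_RE = re.compile(r"Art\.\s*(\d+[a-z]?)(?:\s+Abs\.\s*(\d+))?\s+(GG|AEUV)\b")


def find_artikel(text: str) -> List[Tuple[str, Optional[str], str]]:
    """Return (artikel, absatz, code) triples; absatz is None when absent."""
    return [
        (m.group(1), m.group(2), m.group(3))
        for m in ARTIKEL_RE.finditer(text)
    ]


print(find_artikel("Art. 1 GG und Art. 5 AEUV"))
# [('1', None, 'GG'), ('5', None, 'AEUV')]
```

Because the code abbreviation follows the number, these citations do not fit the "§ N" pattern used for the Paragraph level and would need their own level or document type in DE.yaml.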
Final Grade: ⚠️ Review Needed
Rationale: Core structure is solid, but constitutional and EU law references need coverage.
4. TH.yaml — Thailand
4.1 Hierarchy Completeness: ★★★☆☆
| Level | Included | Actual Law | Notes |
|---|---|---|---|
| มาตรา (Section) | ✅ | Primary unit | Thai numerals |
| วรรค (Paragraph) | ✅ | First paragraph unnumbered | |
| อนุมาตรา (Subsection) | ✅ | (1), (2), (3) | |
| ข้อ (Item) | ✅ | Thai numerals | |
Judgment: 4-level hierarchy exists but needs verification for Thai numeral handling.
4.2 Numbering Accuracy: ★★★☆☆
| Level | Profile Description | Actual Law | Match |
|---|---|---|---|
| มาตรา | Thai numerals (๐-๙) | มาตรา ๔๒๐ | ⚠️ Needs verification |
| วรรค | First unnumbered | วรรคหนึ่ง | ✅ |
| อนุมาตรา | (1), (2), (3) | (๑), (๒), (๓) | ⚠️ Mixed numerals |
Issue: Thai numeral system (๐-๙) needs explicit handling in patterns.
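A minimal sketch of explicit Thai numeral handling is shown below, assuming the patterns should accept both Thai and ASCII digits and normalize them to integers. The names `THAI_TO_ASCII`, `SECTION_RE`, `SUBSECTION_RE`, and the two helper functions are hypothetical and not taken from TH.yaml.

```python
import re
from typing import List

# Thai digits ๐ to ๙ occupy U+0E50 to U+0E59, so a character range works.
THAI_TO_ASCII = str.maketrans("๐๑๒๓๔๕๖๗๘๙", "0123456789")

# Assumed patterns: accept either numeral system after the level keyword.
SECTION_RE = re.compile(r"มาตรา\s*([0-9๐-๙]+)")
SUBSECTION_RE = re.compile(r"\(([0-9๐-๙]+)\)")


def section_numbers(text: str) -> List[int]:
    """Return มาตรา numbers as ints, whichever numeral system was used."""
    return [int(n.translate(THAI_TO_ASCII)) for n in SECTION_RE.findall(text)]


def subsection_numbers(text: str) -> List[int]:
    """Return parenthesized อนุมาตรา numbers as ints."""
    return [int(n.translate(THAI_TO_ASCII)) for n in SUBSECTION_RE.findall(text)]


print(section_numbers("มาตรา ๔๒๐ และ มาตรา 5"))  # [420, 5]
print(subsection_numbers("(๑) (2) (๓)"))          # [1, 2, 3]
```

Normalizing to integers lets "(๑)" and "(1)" resolve to the same reference, which addresses the mixed-numeral mismatch flagged in the table.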
4.3 Few-shot Realism: ★★☆☆☆
Issue: Few-shots are limited. Thai legal text complexity (multiple numeral systems, formal/informal variants) not fully covered.
Final Grade: 🔧 Work Needed
Rationale: Core structure exists but Thai numeral handling and few-shot coverage need significant work.
5. SA.yaml — Saudi Arabia
5.1 Hierarchy Completeness: ★★☆☆☆
| Level | Included | Actual Law | Notes |
|---|---|---|---|
| مادة (Article) | ✅ | Primary unit | Arabic numerals |
| فقرة (Paragraph) | ✅ | (١), (٢), (٣) | Eastern Arabic |
| بند (Item) | ✅ | (أ), (ب), (ج) | Arabic letters |
Judgment: 3-level hierarchy is minimal for Saudi law structure.
5.2 Numbering Accuracy: ★★☆☆☆
| Level | Profile Description | Actual Law | Match |
|---|---|---|---|
| مادة | Arabic numerals | مادة ١ | ⚠️ Eastern Arabic needs handling |
| فقرة | Eastern Arabic (٠-٩) | (١), (٢) | ⚠️ Mixed with Western Arabic |
| بند | Arabic letters | (أ), (ب) | ✅ |
Issue: Eastern Arabic numeral handling is critical and needs explicit pattern support.
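One possible approach is to normalize every Unicode decimal digit to ASCII before pattern matching, so the patterns only need to handle one numeral system. The sketch below uses Python's standard `unicodedata` module. The function name `normalize_digits` is an assumption, and this is not how SA.yaml currently works.

```python
import unicodedata


def normalize_digits(text: str) -> str:
    """Replace any Unicode decimal digit with its ASCII equivalent.

    Covers Eastern Arabic digits (U+0660 to U+0669) along with other
    decimal digit sets; all non-digit characters pass through unchanged.
    """
    return "".join(
        str(unicodedata.decimal(ch)) if ch.isdecimal() else ch
        for ch in text
    )


# Escapes are used so the example reads the same regardless of bidi rendering.
print(normalize_digits("(\u0661)"))       # (1)
print(normalize_digits("\u0661\u0662"))   # 12
```

Digits inside right-to-left text are still stored most significant digit first, so normalization does not need to reverse anything. Display order and storage order differ, which is worth covering in few-shots.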
5.3 Few-shot Realism: ★☆☆☆☆
Issue: Very limited few-shots. Arabic legal text complexity (right-to-left, multiple numeral systems) not adequately covered.
Final Grade: 🔧 Work Needed
Rationale: Core structure exists but Arabic numeral handling, RTL support, and few-shot coverage need significant work.
Cross-Country Comparison
| Country | Hierarchy | Numbering | Few-shots | Edge Cases | Overall |
|---|---|---|---|---|---|
| KR | ★★★★★ | ★★★★★ | ★★★★☆ | ★★★☆☆ | ✅ Deployable |
| US | ★★★★☆ | ★★★★☆ | ★★★☆☆ | ★★★☆☆ | ⚠️ Review Needed |
| DE | ★★★★☆ | ★★★★☆ | ★★★★☆ | ★★★☆☆ | ⚠️ Review Needed |
| TH | ★★★☆☆ | ★★★☆☆ | ★★☆☆☆ | ★★☆☆☆ | 🔧 Work Needed |
| SA | ★★☆☆☆ | ★★☆☆☆ | ★☆☆☆☆ | ★☆☆☆☆ | 🔧 Work Needed |
Recommendations
Immediate (Before Production)
- KR: Monitor reference_id precision in production
- US: Add CFR edge cases to few-shots
- DE: Add constitutional law references
Short-term (1-2 months)
- TH: Implement Thai numeral pattern support
- SA: Implement Eastern Arabic numeral handling
- All: Add adversarial test cases
Long-term (3-6 months)
- All: Expand edge case coverage
- All: Add multilingual document support
- All: Implement case law and treaty citations
Deployment Decision
KR is ready for production deployment. The few-shot reference_id length issue is a performance tuning concern, not a functional defect. Monitor precision metrics in production.
US and DE can be deployed after adding edge case few-shots. Core structure is solid.
TH and SA require significant additional work before production deployment, particularly for numeral system handling.