
Legal Citation Hierarchy Survey — 12 Countries

2026-06-24 | Foundation survey for dynamic schema inference design


Hierarchy Comparison Table (Article level and below)

Country | Mode | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 | Level 6+ | Depth
KR | prefix | Article 제N조 | Paragraph ①②③ | Item 1. 2. 3. | Subitem 가. 나. 다. | 1) 2) 3) | 가) → (1) → (가) | 4+4
JP | prefix | Article 第N条 | Paragraph N (first unnumbered) | Item (一)(二) | Subitem イ, ロ, ハ | (1)(2)(3) | - | 5
CN | prefix | Article 第N条 | Paragraph unnumbered (position) | Item (一)(二) | Subitem 1. 2. | - | - | 4
DE | prefix | § § N / Art. | Paragraph (1)(2) | Sentence S. 1 | Number Nr. 1 | Letter a) | Double letter aa) | 6
FR | prefix | Article Art. N | Paragraph unnumbered (order) | 1° 2° 3° (degree) | Dash | Letter a) b) | - | 5
IT | prefix | Article Art. N | Paragraph comma N | Letter a) b) | Number 1) 2) | - | - | 4
US | paren | § § N | Subsection (a) | Paragraph (1) | Subparagraph (A) | Clause (i) | Subclause (I) → Item (aa) → Subitem (AA) | 8
GB | paren | Section s N | Subsection (1) | Paragraph (a) | Subparagraph (i) | - | - | 4
AU | paren | Section s N | Subsection (1) | Paragraph (a) | Subparagraph (i) | - | - | 4
BR | prefix | Article Art. N | Paragraph § N / Inciso I, II, III | Alínea a) b) | Item 1 2 3 | - | - | 5
RU | prefix | Article ст. N | Part ч. N | Point п. N | Subpoint подп. / а) б) | Paragraph unnumbered (order) | - | 5
IN | mixed | Section S. N | Sub-section (1) | Clause (a) | Proviso "Provided that" | Explanation | Illustration | 5+

Pattern Analysis

1. Citation Modes

Pattern | Countries | Characteristics
Prefix | KR, JP, CN, DE, FR, IT, BR, RU | The article level carries a distinctive prefix (제, 第, §, Art., ст.); lower levels use their own markers
Parenthetical | US, GB, AU | Depth is expressed by nested parentheticals, e.g. (a)(1)(A)(i)
Mixed | IN | Section prefix plus internal parenthetical nesting
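
As a rough illustration of the parenthetical mode, the sketch below splits a US-style tail such as (a)(1)(A)(i) into tokens and names each one by its position in the chain. The level names come from the US row of the comparison table; the function name `split_parenthetical` and the decision to stop at the clause level are assumptions made for this example only.

```python
import re

# Level names for the first four US levels below the section, in nesting order.
US_LEVELS = ["subsection", "paragraph", "subparagraph", "clause"]

# One parenthesised token with no nested parentheses or whitespace inside.
_TOKEN = re.compile(r"\(([^()\s]+)\)")


def split_parenthetical(tail: str) -> list[tuple[str, str]]:
    """Pair each '(x)' token with a level name based on its position."""
    tokens = _TOKEN.findall(tail)
    if len(tokens) > len(US_LEVELS):
        raise ValueError(f"chain deeper than this sketch supports: {tail!r}")
    return list(zip(US_LEVELS, tokens))


print(split_parenthetical("(a)(1)(A)(i)"))
# [('subsection', 'a'), ('paragraph', '1'), ('subparagraph', 'A'), ('clause', 'i')]
```

Position-based naming only works when the chain starts at the subsection level; a citation that begins mid-chain would need the token shape (lowercase letter, digit, uppercase letter, roman numeral) to be checked as well.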

2. Unnumbered Levels

Country | Level | Description
KR | Paragraph | ① omitted when the article has a single paragraph
JP | Paragraph | First paragraph is unnumbered
CN | Paragraph | Always unnumbered; identified by position only
FR | Paragraph | Always unnumbered; identified by order only
RU | Paragraph | Always unnumbered; identified by order only

→ Extraction system must interpret unnumbered levels by position

3. Inserted Article Numbers

Country | Pattern | Example
KR | 의N | 제7조의2
JP | のN | 第8条の2
DE | Na/Nb | § 123a
FR | bis/ter/quater | Article 9 bis
IT | -bis | Art. 5-bis
ES | bis | Artículo 5 bis
RU | .N | ст. 104.1

→ Extraction system must handle inserted article patterns

4. Cross-References

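Type | Examples | Handling
Self-reference | 같은 법 ("the same Act"), this section, ledit article ("the said article") | Direct extraction
Adjacent reference | 前条 ("the preceding article"), 前項 ("the preceding paragraph"), the preceding section | Context-dependent
External reference | 위 법률 ("the above Act"), the said Act, such subsection | Requires resolution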

→ Extraction system must distinguish reference types

5. Special Cases

Country | Case | Description
US | CFR cycling | (a)(1)(i)(A)(1)(i): the same marker pattern repeats after 3 levels
JP | Kanji numerals | 一, 二, 三 for formal text; 1, 2, 3 for informal text
KR | Hangul subitems | 가, 나, 다 for subitem level
CN | Fullwidth parentheses | （） (U+FF08, U+FF09) instead of ASCII ()
TH | Thai numerals | ๐-๙ for official documents
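
The fullwidth-parenthesis case can be handled before pattern matching by folding U+FF08 and U+FF09 to their ASCII counterparts. The sketch below is one possible approach, with `normalize_parens` as a hypothetical helper name. It deliberately avoids blanket NFKC normalization, because NFKC also folds circled numerals such as ① into plain digits, which would erase the Korean paragraph marker.

```python
import unicodedata

# Translate only the two fullwidth parentheses; every other character is kept.
FULLWIDTH_PARENS = str.maketrans({"\uFF08": "(", "\uFF09": ")"})


def normalize_parens(text: str) -> str:
    """Replace fullwidth parentheses with ASCII ones, leaving the rest intact."""
    return text.translate(FULLWIDTH_PARENS)


print(normalize_parens("第三条第\uFF08一\uFF09项"))  # 第三条第(一)项
print(normalize_parens("①"))                         # ① (unchanged)
print(unicodedata.normalize("NFKC", "①"))            # 1 (marker style lost)
```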

Recommendations for Pattern Language

1. Support Multiple Numeral Systems

  • Arabic: 1, 2, 3
  • Roman: I, II, III
  • CJK: 一, 二, 三
  • Circled: ①, ②, ③
  • Thai: ๐, ๑, ๒
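
A minimal sketch of how the five numeral systems above might be folded into plain integers follows. All names (`to_int`, `roman_to_int`, `cjk_to_int`) are hypothetical, the CJK branch only covers 1 to 99, and the circled branch only covers ① to ⑳. Note that a lone token such as "I" is ambiguous between a Roman numeral and a letter marker, so a real extractor would need the expected system for the level rather than pure auto-detection.

```python
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
CJK_DIGITS = {ch: i for i, ch in enumerate("一二三四五六七八九", start=1)}


def roman_to_int(s: str) -> int:
    total = 0
    for i, ch in enumerate(s):
        value = ROMAN[ch]
        # Subtractive notation: a smaller value before a larger one is subtracted.
        if i + 1 < len(s) and ROMAN[s[i + 1]] > value:
            total -= value
        else:
            total += value
    return total


def cjk_to_int(s: str) -> int:
    """Convert CJK numerals from 一 (1) to 九十九 (99)."""
    if "十" in s:
        tens, _, ones = s.partition("十")
        return (CJK_DIGITS[tens] if tens else 1) * 10 + (CJK_DIGITS[ones] if ones else 0)
    return CJK_DIGITS[s]


def to_int(token: str) -> int:
    if not token:
        raise ValueError("empty numeral token")
    if token.isdecimal():  # ASCII digits and Thai digits are both decimal digits
        return int(token)
    if len(token) == 1 and "\u2460" <= token <= "\u2473":  # ① .. ⑳
        return ord(token) - 0x2460 + 1
    if all(ch in CJK_DIGITS or ch == "十" for ch in token):
        return cjk_to_int(token)
    if all(ch in ROMAN for ch in token):
        return roman_to_int(token)
    raise ValueError(f"unrecognised numeral: {token!r}")


for sample in ["3", "๑๒", "②", "二十三", "XIV"]:
    print(sample, "->", to_int(sample))
```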

2. Handle Unnumbered Levels

  • Detect by position (first paragraph after article header)
  • Use is_unnumbered_first flag in CountryProfile
  • Canonical form: paragraph:1 or paragraph:single
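
The positional rule above could be sketched as follows. `CountryProfile` and `is_unnumbered_first` are named in this survey, but the dataclass layout, the `always_unnumbered` field, and the `canonical_paragraphs` function are assumptions introduced purely for illustration. The input is the explicit number found on each paragraph in document order, or None where no marker was present.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CountryProfile:
    code: str
    is_unnumbered_first: bool = False  # flag named in the survey, e.g. JP
    always_unnumbered: bool = False    # hypothetical field, e.g. CN, FR, RU


def canonical_paragraphs(markers: list[Optional[int]], profile: CountryProfile) -> list[str]:
    """Map paragraph markers (None = unnumbered) to canonical paragraph IDs."""
    # A lone unnumbered paragraph has no sibling to be numbered against.
    if len(markers) == 1 and markers[0] is None:
        return ["paragraph:single"]
    result = []
    for position, marker in enumerate(markers, start=1):
        if marker is None:
            allowed = profile.always_unnumbered or (
                profile.is_unnumbered_first and position == 1
            )
            if not allowed:
                raise ValueError(f"{profile.code}: unexpected unnumbered paragraph {position}")
            marker = position  # fall back to the position in the article
        result.append(f"paragraph:{marker}")
    return result


jp = CountryProfile("JP", is_unnumbered_first=True)
print(canonical_paragraphs([None, 2, 3], jp))
# ['paragraph:1', 'paragraph:2', 'paragraph:3']
```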

3. Support Inserted Articles

  • Country-specific suffix patterns
  • Canonical form: article:34-2 for 제34조의2
  • Multiple insertion levels: article:15-2-2 for 第15条の2の2
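
To make the canonical forms above concrete, here is a small sketch that normalizes the KR and JP suffix patterns. The `_PATTERNS` table and `canonical_article` are hypothetical names, only two countries are covered, and the patterns assume Arabic digits as in the examples in this survey (statutes printed with kanji numerals would need numeral conversion first).

```python
import re

# Per-country pattern: base article number, then zero or more insertion suffixes.
_PATTERNS = {
    "KR": (re.compile(r"제(\d+)조((?:의\d+)*)"), "의"),
    "JP": (re.compile(r"第(\d+)条((?:の\d+)*)"), "の"),
}


def canonical_article(text: str, country: str) -> str:
    """Normalize an article citation to the form article:N[-M[-K...]]."""
    pattern, separator = _PATTERNS[country]
    match = pattern.fullmatch(text)
    if match is None:
        raise ValueError(f"not a {country} article citation: {text!r}")
    base, suffixes = match.groups()
    # "의2의3" splits into ["", "2", "3"]; drop the empty leading piece.
    parts = [base] + [piece for piece in suffixes.split(separator) if piece]
    return "article:" + "-".join(parts)


print(canonical_article("제34조의2", "KR"))     # article:34-2
print(canonical_article("第15条の2の2", "JP"))  # article:15-2-2
```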

4. Cross-Reference Resolution

  • Detect reference type (self, adjacent, external)
  • Use cross_references field in CountryProfile
  • LLM Judge for ambiguous cases
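
One way to stage the detection step is a phrase lookup that returns a definite type only when exactly one category matches, leaving everything else for the LLM Judge. The sketch below uses the example phrases from the Cross-References table as its only vocabulary; `RefType`, `_MARKERS`, and `classify_reference` are hypothetical names, and a production list would be far larger and country-specific.

```python
from enum import Enum


class RefType(Enum):
    SELF = "self"
    ADJACENT = "adjacent"
    EXTERNAL = "external"
    AMBIGUOUS = "ambiguous"  # zero or several categories matched


# Marker phrases taken from the survey's cross-reference examples.
_MARKERS = {
    RefType.SELF: ("같은 법", "this section", "ledit article"),
    RefType.ADJACENT: ("前条", "前項", "the preceding section"),
    RefType.EXTERNAL: ("위 법률", "the said Act", "such subsection"),
}


def classify_reference(text: str) -> RefType:
    """Return the single matching reference type, or AMBIGUOUS otherwise."""
    lowered = text.casefold()
    hits = {
        ref_type
        for ref_type, phrases in _MARKERS.items()
        if any(phrase.casefold() in lowered for phrase in phrases)
    }
    return hits.pop() if len(hits) == 1 else RefType.AMBIGUOUS


print(classify_reference("as defined in this section"))  # RefType.SELF
print(classify_reference("前条の規定により"))              # RefType.ADJACENT
```

Anything classified as AMBIGUOUS would then be the input to the judge step rather than being resolved by the lookup.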

Conclusion

The 12-country survey reveals:

  1. 3 citation modes (prefix, parenthetical, mixed), plus suffix markers for inserted articles
  2. 5 unnumbered level patterns across civil law systems
  3. 7 inserted article patterns requiring special handling
  4. Multiple numeral systems (Arabic, Roman, CJK, circled, Thai)

The pattern language must support all these variations while maintaining a simple canonical form for normalization.

