In addition to the information in this document, there is a spreadsheet listing each column used in this conversion.
- Module Uri: http://exist-db.org/apps/cbdb-data/templates
- Author: Duncan Paterson
- Version: 0.7
declare function app:test($node as node(), $model as map) as item()*
This is a sample templating function. It will be called by the templating module if it encounters an HTML element with an attribute: data-template="app:test" or class="app:test" (deprecated). The function has to take 2 default parameters. Additional parameters are automatically mapped to any matching request or function parameter.
- $node - the HTML node with the attribute which triggered this call
- $model - a map containing arbitrary data - used to pass information between template calls.
- Module Uri: http://exist-db.org/apps/cbdb-data/bibliography
The bibliography module transforms core bibliographic data from CBDB into TEI. Its output references the taxonomy generated by the genre module. Its elements are frequently referenced by source attributes across the whole data set.
- Author: Duncan Paterson
- Version: 0.7
declare function bib:bibl-dates($dates as node()*, $type as xs:string?) as node()*
bib:bibl-dates reads the two principal date references in TEXT_CODES: original and published. This function resolves the relations of these dates, expecting a valid no:c_textid. It returns both English and Chinese dates, referring to cal_ZH.xml.
- $dates - is a c_textid
- $type - can take either 'ori' for original, or 'pub' for published dates.
- date; normalizes the distinction between 'during' and 'around' to 'when'.
declare function bib:bibliography($texts as node()*, $mode as xs:string?) as item()*
This function reads the entities in TEXT_CODES (sic) and generates corresponding bibl elements, joining data from TEXT_DATA, TEXT_BIBLCAT_CODES, TEXT_TYPE, EXTANT_CODES, and COUNTRY_CODES.
- $texts - is a c_textid
- $mode - can take three effective values:
- 'v' = validate; performs a validation of the output before passing it on.
- ' ' = normal; runs the transformation without validation.
- 'd' = debug; this is the slowest of all modes.
<bibl id="BIB...">...</bibl>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/global | global:create-mod-by |
http://exist-db.org/apps/cbdb-data/global | global:validate-fragment |
http://exist-db.org/apps/cbdb-data/bibliography | bib:roles |
http://exist-db.org/apps/cbdb-data/bibliography | bib:bibl-dates |
declare function bib:roles($roles as node()*) as node()*
bib:roles reads c_role_id from TEXT_DATA and TEXT_ROLE_CODES and transforms it into matching TEI elements. It simplifies c_role_id[. = 11] 'work included in' to contributor. It currently drops the Chinese terms from `$TEXT_ROLE_CODES//no:c_role_desc_chn`. These could be added back in later via an ODD.
- $roles - is a c_role_id
- author, editor, or publisher with pointers to listPerson.
- Module Uri: http://exist-db.org/apps/cbdb-data/biographies
The biographies module transforms core person and relationship data from CBDB into TEI. The data is stored inside a nested hierarchy of collections and sub-collections linked by xInclude statements.
- Author: Duncan Paterson
- Version: 0.7
- biog:alias
- biog:asso
- biog:biog
- biog:entry
- biog:event
- biog:inst-add
- biog:kin
- biog:name
- biog:new-post
- biog:pers-add
- biog:posses
- biog:status
- biog:write
declare function biog:alias($person as node()*) as node()*
biog:alias outputs aliases, such as pen-names and reign titles, from ALTNAME_DATA and ALTNAME_CODES.
- $person - is a c_personid
<persName type="alias">...</persName>
declare function biog:asso($ego as node()*) as node()*
biog:asso constructs a network of association relations from ASSOC_DATA, ASSOC_CODES, ASSOC_TYPES, and ASSOC_CODE_TYPE_REL. The distance measured by c_assoc_range is dropped. Annotations come from SCHOLARLYTOPIC_CODES, OCCASION_CODES, and LITERARYGENRE_CODES. The output's structure should match biog:kin's.
- $ego - is a c_personid
<relation>...</relation>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
declare function biog:biog($persons as node()*, $mode as xs:string?) as item()*
biog:biog reads the main data table of cbdb: BIOG_MAIN. By calling all previous functions in this module, it performs a large join but it doesn't perform the write operation. In addition to the tables from previous functions, it also reads HOUSEHOLD_STATUS_CODES, ETHNICITY_TRIBE_CODES, and BIOG_SOURCE_DATA. biog:biog generates a person element for each unique person in BIOG_MAIN.
- $persons - is a c_personid
- $mode - can take three effective values:
- 'v' = validate; performs a validation of the output, aborts on validation errors.
- ' ' = normal; runs the transformation without validation.
- 'd' = debug; this is the slowest, does NOT abort upon encountering validation errors.
<person ana="historical">...</person>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/biographies | biog:name |
http://exist-db.org/apps/cbdb-data/global | global:validate-fragment |
http://www.functx.com | functx:pad-integer-to-length |
http://exist-db.org/apps/cbdb-data/biographies | biog:alias |
http://exist-db.org/apps/cbdb-data/biographies | biog:pers-add |
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
http://exist-db.org/apps/cbdb-data/global | global:create-mod-by |
http://exist-db.org/apps/cbdb-data/biographies | biog:posses |
http://exist-db.org/apps/cbdb-data/biographies | biog:entry |
http://exist-db.org/apps/cbdb-data/biographies | biog:event |
http://exist-db.org/apps/cbdb-data/biographies | biog:inst-add |
http://exist-db.org/apps/cbdb-data/biographies | biog:asso |
http://exist-db.org/apps/cbdb-data/biographies | biog:new-post |
http://exist-db.org/apps/cbdb-data/biographies | biog:kin |
http://exist-db.org/apps/cbdb-data/biographies | biog:status |
declare function biog:entry($initiates as node()*) as node()*
biog:entry transforms ENTRY_DATA, ENTRY_CODES, ENTRY_TYPES, ENTRY_CODE_TYPE_REL, and PARENTAL_STATUS_CODES into a typed and annotated event. Currently, c_inst_code and c_exam_field are empty. Its output should match the structure of biog:event.
- $initiates - is a c_personid
<event>...</event>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
declare function biog:event($participants as node()*) as node()*
biog:event reads EVENTS_DATA, EVENT_CODES, EVENTS_ADDR to generate an event element. The structure of biog:event is mirrored by biog:entry. Currently, there are no 'py' or 'en' descriptions in the source data, hence we define a single xml:lang attribute on the parent element.
- $participants - is a c_personid
<event>...</event>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
declare function biog:inst-add($participant as node()*) as node()*
biog:inst-add reads BIOG_INST_DATA and BIOG_INST_CODES, generating an event. Place and time data are in where and when-custom, respectively. The main location of an institution is the one given in listOrg.xml. Currently there appear to be no dates in this table.
- $participant - is a c_personid
<event>...</event>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
declare function biog:kin($self as node()*) as node()*
biog:kin constructs an egocentric network of kinship relations from KIN_DATA, KINSHIP_CODES, and KIN_Mourning. The output's structure should match biog:asso's. The list on page 13f of the CBDB User's Guide is incomplete: $tie includes values not mentioned in the documentation.
- $self - is a c_personid
<relation>...</relation>
declare function biog:name($names as node()*, $lang as xs:string?) as node()*
biog:name reads extended name parts from BIOG_MAIN. To avoid duplication, biog:name checks whether surname and forename components can be fully identified and returns the respective elements; otherwise persName takes a single string value.
- $names - variations of c_name from different tables.
- $lang - can take 4 values:
- 'py' for pinyin,
- 'hz' for hanzi,
- 'proper', or
- 'rm' for names other than Chinese.
<persName>...</persName>
declare function biog:new-post($appointees as node()*) as node()*
biog:new-post reads POSTED_TO_OFFICE_DATA, POSTED_TO_ADDR_DATA, OFFICE_CATEGORIES, APPOINTMENT_TYPE_CODES, and ASSUME_OFFICE_CODES to generate socecStatus pointing to the office taxonomy. The precise role of POSTED_TO_ADDR_DATA is somewhat unclear.
- $appointees - is a c_personid
<socecStatus scheme="#office">...</socecStatus>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
declare function biog:pers-add($resident as node()*) as node()*
biog:pers-add reads the BIOG_ADDR_DATA, and BIOG_ADDR_CODES to generate residence. BIOG_ADDR_CODES//no:c_addr_note would be a good addition to the ODD.
- $resident - is a c_personid
<residence>...</residence>
Module URI | Function Name |
---|---|
http://www.functx.com | functx:pad-integer-to-length |
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
declare function biog:posses($possessions as node()*) as node()*
biog:posses reads POSSESSION_DATA, POSSESSION_ACT_CODES, POSSESSION_ADDR, and MEASURE_CODES. It produces a state element. There is barely any data in these tables, so future versions will undoubtedly see changes.
- $possessions - is a c_personid
<state type="possession">...</state>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
declare function biog:status($achievers as node()*) as node()*
biog:status reads STATUS_DATA and STATUS_CODES and transforms them into state. Two tables are currently empty: STATUS_TYPES and STATUS_CODE_TYPE_REL. This function drops c_notes and c_supplement from STATUS_DATA.
- $achievers - is a c_personid
<state type="status">...</state>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
declare function biog:write($item as item()*) as item()*
Because of the large number (>370k) of individuals, the write operation of biographies.xql is slightly more complex. Instead of putting its data into a single file or collection, it creates a single listPerson directory inside the target folder, which is populated by further subdirectories and ultimately the person records. Currently, cbdbTEI.xml includes links to 37 listPerson files, each covering a chunk of $chunk-size (10k) persons. Each "chunk" collection contains a single list.xml file and $block-size (50) sub-collections. This file contains xInclude statements to one listPerson.xml file per "block" sub-collection. Each block contains a single listPerson.xml file on the same level as its $ppl-per-block (200) individual person records.
- $item -
- Files and Folders for person data:
- Directories:
- creates nested directories listPerson, chunk, and block using the respective parameters.
- Files:
- creates list-X.xml and listPerson.xml files that include xInclude statements linking individual person records back to the main tei file.
- populates the previously generated directories with individual person records by calling biog:biog.
- Error reports from failed write attempts, as well as validation errors, will be stored in the reports directory.
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/biographies | biog:biog |
http://www.functx.com | functx:substring-after-last |
http://www.functx.com | functx:pad-integer-to-length |
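The chunk and block arithmetic described for biog:write can be sketched in Python. This is an illustration under assumptions, not the module's XQuery: the constants mirror the documented values of $chunk-size, $block-size, and $ppl-per-block, while the function names, the 1-based position argument, the per-chunk block numbering, and the directory naming pattern are all hypothetical.

```python
# Documented sizes: 10k persons per chunk, 200 persons per block,
# hence 50 blocks per chunk.
CHUNK_SIZE = 10_000
PPL_PER_BLOCK = 200
BLOCK_SIZE = CHUNK_SIZE // PPL_PER_BLOCK


def locate(position):
    """Return (chunk, block) for the 1-based position of a person record.

    Assumption: blocks are numbered from 1 inside each chunk.
    """
    index = position - 1
    chunk = index // CHUNK_SIZE + 1
    block = (index % CHUNK_SIZE) // PPL_PER_BLOCK + 1
    return chunk, block


def record_path(position, person_id):
    """Build a hypothetical storage path; the naming pattern is invented."""
    chunk, block = locate(position)
    return f"listPerson/chunk-{chunk:02d}/block-{block:02d}/{person_id}.xml"


print(locate(201))                   # second block of the first chunk
print(record_path(10_001, "P0001"))  # first block of the second chunk
```

With these sizes, 370k persons fill exactly 37 chunks, which matches the 37 listPerson files mentioned in the description.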
- Module Uri: http://exist-db.org/apps/cbdb-data/calendar
The calendar module reads the calendar data from GANZHI, DYNASTIES, and NIANHAO to create a taxonomy element for inclusion in the teiHeader. The taxonomy consists of two elements: one for the sexagenary cycle, and one nested taxonomy for reign-titles and dynasties. We are dropping the c_sort value for dynasties since sequential sorting is implicit in the data structure. There are some inconsistencies in how CBDB processes Chinese dates; in the long run, using an external authority could solve these problems.
- Author: Duncan Paterson
- Version: 0.7
- $cal:ZH - missing description
- $cal:path - missing description
- cal:custo-date-point
- cal:custo-date-range
- cal:dynasties
- cal:ganzhi
- cal:isodate
- cal:sexagenary
- cal:sqldate
declare function cal:custo-date-point($dynasty as node()*, $reign as node()*, $year as xs:string*, $type as xs:string?) as node()*
cal:custo-date-point takes Chinese calendar date strings (columns ending in *_dy, *_gz, *_nh). It returns a single tei:date element using att.datable.custom. cal:custo-date-range does the same but for date ranges. The normalized format takes DYNASTY//no:c_sort, which is specific to CBDB, followed by the sequence of reigns determined by their position in cal_ZH.xml, followed by the year number: D(\d*)-R(\d*)-(\d*)
- $dynasty - the sort number of the dynasty.
- $reign - the sequence of the reign period 1st = 1, 2nd = 2, etc.
- $year - the ordinal year of the reign period 1st = 1, 2nd = 2, etc.
- $type - can process 5 kinds of date-point:
- 'Start' and 'End', each optionally preceded by 'u' for uncertainty; defaults to 'when'.
<date datingMethod="#chinTrad" calendar="#chinTrad">input string</date>
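The normalized D-R-year format can be illustrated with a short Python sketch. It is not the module's code. The regular expression is the one quoted in the description; the function names are hypothetical, and the mapping from $type values to att.datable.custom attributes is an assumption (the attribute names themselves are standard TEI).

```python
import re

# Pattern quoted in the description of the normalized format.
PATTERN = re.compile(r"D(\d*)-R(\d*)-(\d*)")

# Assumed mapping from $type to TEI att.datable.custom attributes.
CUSTOM_ATTRIBUTES = {
    "Start": "from-custom",
    "End": "to-custom",
    "uStart": "notBefore-custom",
    "uEnd": "notAfter-custom",
}


def custom_date_value(dynasty_sort, reign_position, year):
    """Join dynasty sort number, reign position, and year of reign."""
    return f"D{dynasty_sort}-R{reign_position}-{year}"


def custom_date_attribute(date_type=None):
    """Pick the attribute name; anything unrecognized falls back to 'when'."""
    return CUSTOM_ATTRIBUTES.get(date_type, "when-custom")


value = custom_date_value(15, 3, 7)
print(custom_date_attribute("uStart"), value)
print(PATTERN.fullmatch(value).groups())
```

Because reigns are identified by position rather than by name, the string sorts and parses without any lookup in cal_ZH.xml.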
declare function cal:custo-date-range($dy-start as node()*, $dy-end as node()*, $reg-start as node()*, $reg-end as node()*, $year-start as xs:string*, $year-end as xs:string*, $type as xs:string?) as node()*
This function takes Chinese calendar date ranges. It's the companion to cal:custo-date-point. It determines the matching end-points automatically when provided a starting point for a date range.
- $dy-start - the sort number of the starting dynasty.
- $dy-end -
- $reg-start - the sequence of the starting reign period 1st = 1, 2nd = 2, etc.
- $reg-end -
- $year-start - the ordinal year of the starting reign period 1st = 1, 2nd = 2, etc.
- $year-end -
- $type - has two options: 'uRange' for uncertain ranges; defaults to certain ranges.
<date datingMethod="#chinTrad" calendar="#chinTrad">input string</date>
declare function cal:dynasties($dynasties as node()*, $mode as xs:string?) as item()*
cal:dynasties converts DYNASTIES, and NIANHAO data into categories.
- $dynasties - c_dy
- $mode - can take three effective values:
- 'v' = validate; performs a validation of the output before passing it on.
- ' ' = normal; runs the transformation without validation.
- 'd' = debug; this is the slowest of all modes.
<taxonomy xml:id="reign">...</taxonomy>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/global | global:validate-fragment |
declare function cal:ganzhi($year as xs:integer, $lang as xs:string?) as xs:string*
Just for fun: cal:ganzhi calculates the ganzhi cycle for a given year. It assumes gYears for calculating BCE dates.
- $year - gYear compatible string.
- $lang - is either 'zh' for hanzi or 'py' for pinyin output.
- ganzhi cycle as string in either hanzi or pinyin.
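The ganzhi calculation can be sketched in Python using the standard sexagenary arithmetic, in which 4 CE is a jiazi year. This is an illustration, not the module's XQuery. It assumes that negative years follow the gYear convention without a year zero (so -1 means 1 BCE), and the hyphenated pinyin output is an arbitrary formatting choice.

```python
STEMS_ZH = "甲乙丙丁戊己庚辛壬癸"
BRANCHES_ZH = "子丑寅卯辰巳午未申酉戌亥"
STEMS_PY = ["jia", "yi", "bing", "ding", "wu",
            "ji", "geng", "xin", "ren", "gui"]
BRANCHES_PY = ["zi", "chou", "yin", "mao", "chen", "si",
               "wu", "wei", "shen", "you", "xu", "hai"]


def ganzhi(year, lang="zh"):
    """Return the stem-branch pair for a year (negative = BCE)."""
    if year == 0:
        raise ValueError("there is no year 0 in this convention")
    # Shift BCE years onto a continuous (astronomical) count.
    astronomical = year if year > 0 else year + 1
    offset = (astronomical - 4) % 60
    stem, branch = offset % 10, offset % 12
    if lang == "py":
        return f"{STEMS_PY[stem]}-{BRANCHES_PY[branch]}"
    return STEMS_ZH[stem] + BRANCHES_ZH[branch]


print(ganzhi(1984))        # 甲子
print(ganzhi(1911, "py"))  # xin-hai
```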
declare function cal:isodate($string as xs:string?) as xs:string*
cal:isodate turns inconsistent Gregorian year strings into proper xs:gYear type strings, consisting of 4 digits with leading 0s. This means that BCE dates have to be recalculated, since '0 AD' -> "-0001".
- $string - year number in western style counting
- gYear style string
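A minimal Python sketch of this normalization follows. It is not the module's implementation. The mapping of 0 to "-0001" is taken from the description above; padding negative years while leaving their value unchanged is an assumption, since the description does not say whether other BCE years are shifted.

```python
def isodate(year_string):
    """Normalize a western year string to a 4-digit gYear-style string."""
    if year_string is None or year_string.strip() == "":
        return None
    year = int(year_string)
    if year == 0:
        # Documented special case: '0 AD' becomes "-0001".
        return "-0001"
    sign = "-" if year < 0 else ""
    # Assumption: other BCE years keep their value and are only padded.
    return f"{sign}{abs(year):04d}"


print(isodate("618"))   # 0618
print(isodate("-221"))  # -0221
print(isodate("0"))     # -0001
```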
declare function cal:sexagenary($ganzhi as node()*, $mode as xs:string?) as item()*
cal:sexagenary converts GANZHI data into categories.
- $ganzhi - c_ganzhi_code
- $mode - can take three effective values:
- 'v' = validate; performs a validation of the output before passing it on.
- ' ' = normal; runs the transformation without validation.
- 'd' = debug; this is the slowest of all modes.
<taxonomy xml:id="sexagenary">...</taxonomy>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/global | global:validate-fragment |
declare function cal:sqldate($timestamp as xs:string?) as xs:string*
cal:sqldate converts the timestamp-like values from CBDB's RDBMS into ISO-compatible date strings, i.e. YYYY-MM-DD.
- $timestamp - collection of strings for Western-style full dates
- string in the format: YYYY-MM-DD
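The conversion can be sketched in Python as follows. The input layout is an assumption: the description does not specify the timestamp format, so this sketch supposes a compact string whose first eight characters are YYYYMMDD, and it returns nothing for input that does not fit.

```python
def sqldate(timestamp):
    """Turn a compact timestamp into YYYY-MM-DD.

    Assumption: the first eight characters are the digits YYYYMMDD.
    """
    if not timestamp:
        return None
    digits = timestamp.strip()
    if len(digits) < 8 or not digits[:8].isdigit():
        return None
    return f"{digits[:4]}-{digits[4:6]}-{digits[6:8]}"


print(sqldate("20070312"))  # 2007-03-12
```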
- Module Uri: http://exist-db.org/apps/cbdb-data/genre
genre.xql combines $TEXT_BIBLCAT_CODES and $TEXT_BIBLCAT_TYPES into nested taxonomy elements. These are referenced from listBibl.xml. The exact difference between bibliographical category codes and category types is unclear. This module joins them within one taxonomy and at the level specified in the sources.
- Author: Duncan Paterson
- Version: 0.7
declare function gen:nest-types($types as node()*, $type-id as node(), $zh as node(), $en as node(), $mode as xs:string?) as item()*
gen:nest-types recursively transforms TEXT_BIBLCAT_TYPES into nested categories.
- $types - row in TEXT_BIBLCAT_TYPES
- $type-id - is a c_text_cat_type_id
- $zh - category name in Chinese
- $en - category name in English
- $mode - can take three effective values:
- 'v' = validate; performs a validation of the output before passing it on.
- ' ' = normal; runs the transformation without validation.
- 'd' = debug; this is the slowest of all modes.
- nested <category xml:id="biblType">...</category>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/global | global:validate-fragment |
http://exist-db.org/apps/cbdb-data/genre | gen:nest-types |
- Module Uri: http://exist-db.org/apps/cbdb-data/global
A set of helper functions and variables called by other modules.
- Author: Duncan Paterson
- Version: 0.7
- $global:ADDRESSES - missing description
- $global:ADDR_BELONGS_DATA - missing description
- $global:ADDR_CODES - missing description
- $global:ADDR_PLACE_DATA - missing description
- $global:ADDR_XY - missing description
- $global:ALTNAME_CODES - missing description
- $global:ALTNAME_DATA - missing description
- $global:APPOINTMENT_TYPE_CODES - missing description
- $global:ASSOC_CODES - missing description
- $global:ASSOC_CODE_TYPE_REL - missing description
- $global:ASSOC_DATA - missing description
- $global:ASSOC_TYPES - missing description
- $global:ASSUME_OFFICE_CODES - missing description
- $global:BIOG_ADDR_CODES - missing description
- $global:BIOG_ADDR_DATA - missing description
- $global:BIOG_INST_CODES - missing description
- $global:BIOG_INST_DATA - missing description
- $global:BIOG_MAIN - missing description
- $global:BIOG_SOURCE_DATA - missing description
- $global:CHORONYM_CODES - missing description
- $global:COUNTRY_CODES - missing description
- $global:CopyMissingTables - missing description
- $global:CopyTables - missing description
- $global:DATABASE_LINK_CODES - missing description
- $global:DATABASE_LINK_DATA - missing description
- $global:DYNASTIES - missing description
- $global:ENTRY_CODES - missing description
- $global:ENTRY_CODE_TYPE_REL - missing description
- $global:ENTRY_DATA - missing description
- $global:ENTRY_TYPES - missing description
- $global:ETHNICITY_TRIBE_CODES - missing description
- $global:EVENTS_ADDR - missing description
- $global:EVENTS_DATA - missing description
- $global:EVENT_CODES - missing description
- $global:EXTANT_CODES - missing description
- $global:FIX_AUTHORS - missing description
- $global:FormLabels - missing description
- $global:GANZHI_CODES - missing description
- $global:HOUSEHOLD_STATUS_CODES - missing description
- $global:KINSHIP_CODES - missing description
- $global:KIN_DATA - missing description
- $global:KIN_MOURNING_STEPS - missing description
- $global:KIN_Mourning - missing description
- $global:LITERARYGENRE_CODES - missing description
- $global:MEASURE_CODES - missing description
- $global:NIAN_HAO - missing description
- $global:NameAutoCorrectSaveFailures - missing description
- $global:OCCASION_CODES - missing description
- $global:OFFICE_CATEGORIES - missing description
- $global:OFFICE_CODES - missing description
- $global:OFFICE_CODES_CONVERSION - missing description
- $global:OFFICE_CODE_TYPE_REL - missing description
- $global:OFFICE_TYPE_TREE - missing description
- $global:PARENTAL_STATUS_CODES - missing description
- $global:PLACE_CODES - missing description
- $global:POSSESSION_ACT_CODES - missing description
- $global:POSSESSION_ADDR - missing description
- $global:POSSESSION_DATA - missing description
- $global:POSTED_TO_ADDR_DATA - missing description
- $global:POSTED_TO_OFFICE_DATA - missing description
- $global:POSTING_DATA - missing description
- $global:PasteErrors - missing description
- $global:SCHOLARLYTOPIC_CODES - missing description
- $global:SOCIAL_INSTITUTION_ADDR - missing description
- $global:SOCIAL_INSTITUTION_ADDR_TYPES - missing description
- $global:SOCIAL_INSTITUTION_ALTNAME_CODES - missing description
- $global:SOCIAL_INSTITUTION_ALTNAME_DATA - missing description
- $global:SOCIAL_INSTITUTION_CODES - missing description
- $global:SOCIAL_INSTITUTION_CODES_CONVERSION - missing description
- $global:SOCIAL_INSTITUTION_NAME_CODES - missing description
- $global:SOCIAL_INSTITUTION_TYPES - missing description
- $global:STATUS_CODES - missing description
- $global:STATUS_CODE_TYPE_REL - missing description
- $global:STATUS_DATA - missing description
- $global:STATUS_TYPES - missing description
- $global:TEXT_BIBLCAT_CODES - missing description
- $global:TEXT_BIBLCAT_CODE_TYPE_REL - missing description
- $global:TEXT_BIBLCAT_TYPES - missing description
- $global:TEXT_BIBLCAT_TYPES_1 - missing description
- $global:TEXT_BIBLCAT_TYPES_2 - missing description
- $global:TEXT_CODES - missing description
- $global:TEXT_DATA - missing description
- $global:TEXT_ROLE_CODES - missing description
- $global:TEXT_TYPE - missing description
- $global:TablesFields - missing description
- $global:TablesFieldsChanges - missing description
- $global:YEAR_RANGE_CODES - missing description
- $global:bibliography - missing description
- $global:calendar - missing description
- $global:doc - missing description
- $global:gaiji - missing description
- $global:genre - missing description
- $global:institution - missing description
- $global:main - missing description
- $global:modules - missing description
- $global:office - missing description
- $global:office-temp - missing description
- $global:patch - missing description
- $global:person - missing description
- $global:place - missing description
- $global:report - missing description
- $global:samples - missing description
- $global:src - missing description
- $global:target - missing description
declare function global:create-mod-by($created as node()*, $modified as node()*) as node()*
This function takes the standardized entries for creation and modification of CBDB entries and translates them into note elements. This data is distinct from the modifications of the TEI output recorded in the header.
- $created - is c_created_by
- $modified - is c_modified_by
<note type="created | modified">...</note>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/calendar | cal:sqldate |
declare function global:validate-fragment($frag as node()*, $loc as xs:string?) as item()*
This function validates $frag by inserting it into a minimal TEI template. This function cannot guarantee that the final document is valid, but it can catch validation errors produced by other functions early on. This minimizes the number of validations necessary to produce the final output.
- $frag - the fragment (usually some function's output) to be validated.
- $loc - accepts the following element names as root to be used for validation:
- category
- charDecl
- person
- org
- bibl
- place
- if validation succeeds then return the input, otherwise store a copy of the validation report into the reports directory, including the xml:id of the root element of the processed fragment.
- Module Uri: http://exist-db.org/apps/cbdb-data/institutions
This module does for institutions what the biographies module does for persons.
- Author: Duncan Paterson
- Version: 0.7
declare function org:org($institutions as node()*, $mode as xs:string?) as item()*
This function transforms data from SOCIAL_INSTITUTION_CODES, SOCIAL_INSTITUTION_NAME_CODES, SOCIAL_INSTITUTION_TYPES, SOCIAL_INSTITUTION_ALTNAME_DATA, SOCIAL_INSTITUTION_ALTNAME_CODES, SOCIAL_INSTITUTION_ADDR, and SOCIAL_INSTITUTION_ADDR_TYPES into TEI. For now there are only three role attribute values: academy, buddhist, and daoist. However, the altName tables and address-type tables are empty!
- $institutions - is a c_inst_code
- $mode - can take three effective values:
- 'v' = validate; performs a validation of the output before passing it on.
- ' ' = normal; runs the transformation without validation.
- 'd' = debug; this is the slowest of all modes.
<org>...</org>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/calendar | cal:custo-date-point |
http://exist-db.org/apps/cbdb-data/calendar | cal:custo-date-range |
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
http://exist-db.org/apps/cbdb-data/global | global:validate-fragment |
- Module Uri: http://exist-db.org/apps/cbdb-data/office
To generate the taxonomy for office titles, we need two query files: office.xql and officeB.xql. office creates two files, which are then merged by officeB. Each file stores a taxonomy for one of the two different ways that offices are categorized by CBDB.
- Author: Duncan Paterson
- Version: 0.7
declare function off:nest-children($data as node()*, $id as node(), $zh as node(), $en as node()) as node()*
off:nest-children recursively transforms $OFFICE_TYPE_TREE into nested categories.
- $data - row in OFFICE_TYPE_TREE
- $id - is a c_office_type_node_id
- $zh - category name in Chinese
- $en - category name in English
- nested <category n="...">...</category>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/office | off:nest-children |
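The recursive nesting performed by off:nest-children (and, in the same spirit, by gen:nest-types and pla:nest-places) can be sketched in Python. This is a simplified illustration, not the module's XQuery: the flat row structure, the parent key, the catDesc children, and the sample ids and labels are all hypothetical.

```python
import xml.etree.ElementTree as ET


def nest_children(rows, node_id):
    """Build a <category> for node_id and recurse into its child rows."""
    row = rows[node_id]
    category = ET.Element("category", {"n": node_id})
    ET.SubElement(category, "catDesc", {"xml:lang": "zh"}).text = row["zh"]
    ET.SubElement(category, "catDesc", {"xml:lang": "en"}).text = row["en"]
    for child_id, child in rows.items():
        if child["parent"] == node_id:
            category.append(nest_children(rows, child_id))
    return category


# Hypothetical flat rows standing in for a parent/child table.
rows = {
    "01": {"parent": None, "zh": "zh label A", "en": "Label A"},
    "0101": {"parent": "01", "zh": "zh label B", "en": "Label B"},
    "0102": {"parent": "01", "zh": "zh label C", "en": "Label C"},
    "010101": {"parent": "0101", "zh": "zh label D", "en": "Label D"},
}

tree = nest_children(rows, "01")
print(ET.tostring(tree, encoding="unicode"))
```

Each call scans all rows for children, which is simple but quadratic; indexing rows by parent would be the obvious refinement for large tables.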
declare function off:office($offices as node()*, $mode as xs:string?) as item()*
off:office transforms OFFICE_CODES, OFFICE_CODE_TYPE_REL, and OFFICE_TYPE_TREE data into category elements.
- $offices - is a c_office_id
- $mode - can take three effective values:
- 'v' = validate; performs a validation of the output before passing it on.
- ' ' = normal; runs the transformation without validation.
- 'd' = debug; this is the slowest of all modes.
<category xml:id="OFF...">...</category>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/global | global:validate-fragment |
- Module Uri: http://exist-db.org/apps/cbdb-data/place
place.xql reads the various basic entities for location-type information and creates a listPlace element for inclusion in the body element via xInclude. To avoid confusion: 'addresses' type data in CBDB is 'place' data in TEI, whereas CBDB's 'place' is TEI's 'geo'. This data should soon be replaced with data from China Historical GIS.
- Author: Duncan Paterson
- Version: 0.7
declare function pla:fix-admin-types($adminType as xs:string?) as xs:string*
There are 225 distinct types of administrative units in CBDB; however, these contain many duplicates due to inconsistent spelling. Furthermore, white-spaces prevent the existing types from becoming xml attribute values. Hence this function normalizes and concatenates the spelling of admin types without modifying the source.
- $adminType - is a c_admin_type
- normalized and deduped string
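The kind of normalization described here can be sketched in Python. The concrete rules are assumptions chosen for illustration: lower-casing, collapsing runs of white-space, and joining words with a hyphen. The actual spelling rules of pla:fix-admin-types are not documented here, and the sample values are invented.

```python
def fix_admin_type(admin_type):
    """Normalize an admin type into a single white-space-free token."""
    if admin_type is None:
        return None
    words = admin_type.lower().split()
    return "-".join(words) if words else None


# Invented spelling variants that collapse into one value.
variants = ["Military Prefecture", "military  prefecture", " Military prefecture "]
print({fix_admin_type(v) for v in variants})
```

Normalizing on read like this leaves the source tables untouched, which matches the stated goal of not modifying the source.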
declare function pla:nest-places($data as node()*, $id as node(), $zh as node()?, $py as node()?, $mode as xs:string?) as item()*
pla:nest-places recursively reads rows from ADDR_CODES and the first ADDR_BELONGS_DATA parent, to generate place elements. This leaves duplicate ids between here and ADDRESSES. Where multiple identical c_addr_id's are present, we use the one covering the largest admin level. All cases of overlapping dates for location data can actually be resolved to min/max.
- $data - is ADDR_CODES row elements
- $id - is a c_addr_id
- $zh - placeName in Chinese
- $py -
- $mode - can take three effective values:
- 'v' = validate; performs a validation of the output before passing it on.
- ' ' = normal; runs the transformation without validation.
- 'd' = debug; this is the slowest of all modes.
- nested <place xml:id="PL...">...</place>
Module URI | Function Name |
---|---|
http://exist-db.org/apps/cbdb-data/calendar | cal:isodate |
http://exist-db.org/apps/cbdb-data/place | pla:fix-admin-types |
http://exist-db.org/apps/cbdb-data/global | global:validate-fragment |
http://exist-db.org/apps/cbdb-data/place | pla:nest-places |
declare function pla:patch-missing-addr($data as node()*) as node()*
pla:patch-missing-addr makes sure that every c_addr_id from CBDB is present in listPlace.xml. It does so by inserting empty places present in ADDRESSES but not ADDR_CODES, using a
- $data - row elements from ADDRESSES table.
<place>...</place>