Feature / Bug List
Modelling/Markup of Data: The modelling of the underlying raw data should follow a few basic principles: (1) it should be designed with human-readability in mind where possible, (2) it should attempt to minimize the total number of tags where possible (without losing )
- [complete] paragraph numbering
- [complete] alternative citations by which the document may be referenced
- [complete] abbreviated version of the case's full name for simplified display and automatic generation of indexes, case tables, etc.
- [complete] individually tagged dates to differentiate and allow search according to when a case was argued, etc.
- [complete] individually tagged outbound case citations.
- [complete] standardized expression of character entities (e.g., em dashes, section symbols, etc.)
- [complete] create 'docpart' level element
- [complete] change all 'auth-title' attrib. values to full judge titles (e.g., 'District Judge' not 'D.J.') for consistency.
- [unassigned] design a model for the optional/scalable breakup of history data into prior and subsequent. Explore options and possible need to indicate parallel opinions.
- [unassigned] Inclusion of related opinions in header info because of need to exclude citations from cite value analysis.
- [unassigned] Inclusion of disposition info (e.g., affirmed, reversed, remanded) in header box and how info might be coded in RDF stage.
- [unassigned] Inclusion of alternative source pagination data and tagging design to allow inbound pinpoint citation.
- [unassigned] Inclusion of one-line summary of document.
- [unassigned] Explore using "abbr" and "acronym" tags.
- [unassigned] Explore automatic tagging of media products (e.g., names of books, records, software, etc.)
- [unassigned] Revisit the issue of multiple data tags vs. a single tag with an optional attribute to set type.
Metadata Generation: While most of the metadata included in the final marked-up version of the court documents is simply an alternative representation of the information contained in the document's header, there is a significant amount of useful information that can be collected from outside sources. A primary example of this is a listing of cases that reference the case: this information is not contained in the document itself but can be gathered from the citing documents and incorporated into the document's RDF metadata expressions. Other examples are the formal categories to which the document belongs and significant legal keywords drawn from the document's text.
- [complete] licensing data (i.e., public domain) included in RDF metadata.
- [complete] define meaning and utility of Dublin Core tags to court document data.
- [pending] Build legal taxonomies for keyword and category generation.
- [pending] Inclusion of inbound citations as "dcterms:isReferencedBy" RDF statements.
- [pending] Inclusion of outbound citations as "dcterms:references" RDF statements.
- [unassigned] add internal RDF items for tracking a doc's level of markup and verification.
- [unassigned] Fix generation of RDF dates
- [unassigned] conversion of one-line summary data into "dc:description" RDF statements.
- [unassigned] implement conversion of court tag into "dc:creator" statements.
- [unassigned] implement conversion of
- [unassigned] explore need for generation of stand-alone RDF file.
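The inbound and outbound citation items above are two views of the same data: once each document's tagged outbound citations are known, the inbound list for any case can be derived by inverting that mapping. The following is a minimal sketch of that idea in Python, not the project's actual conversion code. The case identifiers and the function names (invert_citations, citation_triples) are hypothetical, and the statements are represented as plain (subject, predicate, object) tuples rather than serialized RDF.

```python
from collections import defaultdict

# Dublin Core terms namespace; "references" and "isReferencedBy" are DCMI terms.
DCTERMS = "http://purl.org/dc/terms/"


def invert_citations(outbound):
    """Map each cited case to the sorted list of cases that cite it."""
    inbound = defaultdict(set)
    for citing, cited_cases in outbound.items():
        for cited in cited_cases:
            inbound[cited].add(citing)
    return {case: sorted(citers) for case, citers in inbound.items()}


def citation_triples(outbound):
    """Build (subject, predicate, object) tuples for both citation directions."""
    triples = []
    # Outbound citations come straight from the tagged document.
    for citing, cited_cases in sorted(outbound.items()):
        for cited in sorted(set(cited_cases)):
            triples.append((citing, DCTERMS + "references", cited))
    # Inbound citations are gathered from the citing documents.
    for cited, citers in sorted(invert_citations(outbound).items()):
        for citing in citers:
            triples.append((cited, DCTERMS + "isReferencedBy", citing))
    return triples


# Hypothetical identifiers, for illustration only.
outbound = {
    "case-a": ["case-b", "case-c"],
    "case-b": ["case-c"],
}

for triple in citation_triples(outbound):
    print(triple)
```

Deriving the inbound statements from the outbound ones keeps the two lists consistent, at the cost of regenerating a cited document's metadata whenever a new citing document is added.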
Native Display: Although it is expected that the XML version of the data will be converted into any number of formats and custom displays, the XHTML display presented on the OpenGavel site provides a good test case for working out solutions to some common display issues. Most of these issues involve finding a standards-compliant CSS solution, but they can sometimes reveal an underlying issue with how the data was originally marked up.
- [complete] document header (centering and dynamic 2-columns layout)
- [complete] paragraph numbering
- [complete] footnote numbering
- [complete] proper generation of footnotes attached to header elements
- [complete] party name highlighting
- [complete] TOC for multipart opinions.
- [complete] add Pub. Dom. footer/link.
- [pending] fix footnote display for non-para elements (e.g., fns. on headings not displaying, etc.)
- [pending] create context-sensitive ids for footnotes
- [pending] create links to return user from footnote to place in text.
- [unassigned] fix double superscript bug for asterisks as foot-refs.
- [unassigned] link/tab for listing of inbound/outbound citations.
- [unassigned] link/tab for original file (e.g., PDFs).
- [unassigned] save as file type feature.
- [unassigned] css stylesheet for print.
- [unassigned] fix Safari doc header display bug.
- [unassigned] fix display bug when auth="Per Curiam" results in comma and colon.
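The pending footnote items above (context-sensitive ids and links that return the user to the place in the text) can be handled together by generating a matched pair of ids for every footnote. The sketch below shows one possible scheme in Python; the id format (fn-/fnref- plus a context label and number), the "footnote" class name, and the function names are all assumptions made for illustration, not the site's existing markup. The context label is assumed to already be a valid id fragment (e.g., "header" or "part2").

```python
from html import escape


def footnote_ids(context, number):
    """Return (reference id, note id) for one footnote within a context."""
    return f"fnref-{context}-{number}", f"fn-{context}-{number}"


def render_reference(context, number, marker):
    """In-text marker that links down to the footnote."""
    ref_id, note_id = footnote_ids(context, number)
    return f'<sup id="{ref_id}"><a href="#{note_id}">{escape(marker)}</a></sup>'


def render_note(context, number, marker, text):
    """Footnote body whose marker links back up to the place in the text."""
    ref_id, note_id = footnote_ids(context, number)
    return (
        f'<div class="footnote" id="{note_id}">'
        f'<a href="#{ref_id}">{escape(marker)}</a> {escape(text)}</div>'
    )


# Hypothetical usage: an asterisk footnote attached to a header element.
print(render_reference("header", 1, "*"))
print(render_note("header", 1, "*", "Sitting by designation."))
```

Including the context in the id keeps a footnote numbered 1 in the header from colliding with a footnote numbered 1 in a later part of a multipart opinion.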
Administrative Issues
- [complete] index update form
- [pending] add automated index update to conversion process.
- [pending] rework admin scripts to OO scripting format (opin-data, opin-metadata).
- [pending] create line graph displaying repo size.
- [pending] purchase license for flash chart tool.
- [unassigned] add a universal find-replace script to make global changes (e.g., full auth-titles) easier.
- [unassigned] write errors to error logs instead of displaying them on screen.
- [unassigned] file sensitivity on find_??? function scripts
- [unassigned] add paypal donate button to donate page.
- [unassigned] create page focused on third party developers (e.g., guide doc., etc.)
- [unassigned] create page focused on user linking (e.g., how to link to specific paragraph, etc.).
- [unassigned] create bibliometrics page for stat info (most cited, most frequent parties, etc.). Use this to work on relevancy calculations (e.g., judicial impact factor) and base it on the creation of XML 'top X' logs.
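As a starting point for the "most cited" statistic in the bibliometrics item, a raw inbound-citation count can be computed directly from each document's outbound citations. The sketch below is a minimal Python illustration under stated assumptions: the input shape (a mapping from a citing case to the cases it cites), the case identifiers, and the most_cited name are hypothetical, and a plain count is not a relevancy measure such as an impact factor.

```python
from collections import Counter


def most_cited(outbound, top_n=10):
    """Rank cases by the number of distinct documents that cite them."""
    counts = Counter()
    for cited_cases in outbound.values():
        # A citing document counts once per cited case, however often it repeats the cite.
        counts.update(set(cited_cases))
    # Sort by count (descending), then by identifier so ties are stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical identifiers, for illustration only.
sample = {
    "case-a": ["case-b", "case-c", "case-c"],
    "case-b": ["case-c"],
    "case-d": ["case-b", "case-c"],
}

for case, count in most_cited(sample, top_n=2):
    print(case, count)
```

The ranked list could then be written out as one of the 'top X' logs, with the sort key swapped for a weighted score once a relevancy calculation is settled.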