Currently Skimming:


Pages 3-120

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 3...
... Volume I: A Guide for State Transportation Agencies
From page 4...
... Contents
Volume I: A Guide for State Transportation Agencies
I-5 Chapter 1 Finding Information When You Need It
I-5 1.1 Business Drivers of Findability: Risks and Opportunities
I-8 1.2 Guide Organization
I-8 1.3 A Note on Terminology
I-9 Chapter 2 Understanding Findability
I-9 2.1 Overview
I-10 2.2 Elements of Findability in a DOT
I-12 2.3 Typical Impediments to Findability
I-14 Chapter 3 Improving Findability
I-14 3.1 Improving Information Management Discipline
I-18 3.2 Improving Search and Navigation Capabilities
I-24 3.3 Improving Metadata and Terminology Management
I-34 Chapter 4 Planning for Findability Improvements
I-34 4.1 Understanding User Needs
I-37 4.2 Surveying the Information Landscape
I-42 Chapter 5 Implementing Findability Improvements
I-42 5.1 Establishing a Road Map for Improving Findability
I-46 5.2 Putting Management Functions and Processes in Place
I-53 References
I-54 Abbreviations
I-56 Appendix A Example Improvement Initiatives
I-56 Improvement 1: Focus on Findability of Construction Project Information
I-59 Improvement 2: Focus on Findability of Critical Corporate Documents
I-62 Improvement 3: Focus on Findability of Information for Critical Job Functions
I-65 Appendix B Glossary
I-71 Appendix C Special Topics
I-71 Topic 1: Search
I-79 Topic 2: Metadata
I-82 Topic 3: Text Analytics
I-88 Topic 4: Terminology and Semantic Structures to Improve Search
I-93 Topic 5: Integration Considerations for Enterprise Search
I-95 Appendix D DOT Information Organization Resources
I-110 Appendix E Examples of Commercially Available Enterprise Search and Text Analytics Products
From page 5...
... 1.1 Business Drivers of Findability: Risks and Opportunities
Over the past two decades, technologies for creating and sharing information in electronic form have become pervasive. Like most large organizations, transportation agencies have experienced challenges managing a growing collection of information, including tabular data, spatial data files, Computer-Aided Design and Drafting (CADD)
From page 6...
... I-6 Improving Findability and Relevance of Transportation Information
In addition to the risks noted above, there are hidden costs to poor findability. When a DOT's information is not well-organized, accessible, and easily searchable, employees spend a great deal of time looking for relevant, accurate information.
From page 7...
... For DOTs, here are some realistic scenarios that could be avoided through improved findability:
• The DOT receives a FOIA request asking for all information related to the design of a new interchange. The requestor wants information on the chronology of key design decisions, including meeting notes, plans, and emails.
From page 8...
... applicability, and a lack of consistency in both search interfaces and information management practices, resulting in confusion on the part of people searching for information. An agency-wide approach to findability takes resources and coordinated action within the organization.
From page 9...
... Chapter 2
This chapter introduces the different elements of findability. A holistic understanding of these elements is important for identifying appropriate improvements.
From page 10...
... 2.2 Elements of Findability in a DOT
Figure I-3 illustrates in more detail the different elements of findability within a DOT. A typical search scenario involves:
• Users (people seeking information)
From page 11...
... • Information repositories. The information being sought must be stored in a location from which it can be retrieved.
From page 12...
... 2.3 Typical Impediments to Findability
Fundamental impediments to findability include lack of disciplined information management practices, non-searchable information, lack of investment in common information repositories and search tools, and lack of a consistent approach to metadata and use of terminology for information classification.
Lack of Disciplined Information Management Practices
• Data and information resources are not handled as organizational assets with deliberate planning of acquisition, maintenance, retirement and valuation for long-range planning.
From page 13...
... • Full-text search has inherent limitations: it relies on literal matching of terms and does not readily support searches based on meaning. (This situation is improving with advancements in text analytics and machine learning.)
From page 14...
... Chapter 3
Three overarching strategies for improving findability are:
• Improving information management discipline by cultivating disciplined practices for creating, naming, storing, versioning, and culling content.
• Improving search and navigation capabilities by implementing and configuring tools to explore available content and locate items of interest.
From page 15...
... • Where content should be stored.
• How to name files.
From page 16...
... or pruning activities typically will not happen on their own. Managers of different information repositories must provide opportunities, tools, services, and motivations for eliminating ROT.
From page 17...
... Standard File Scanning Protocols
Scanning protocols should ensure that the content is text-searchable. Image files that have not been processed with optical character recognition (OCR) cannot be found through full-text search.
From page 18...
... authoritative version at any given time? Where appropriate, do these workflow and responsibility definitions enable auditing of changes?
From page 19...
... The disparity between users' expectations for enterprise search and the ability of existing tools to meet these expectations was also observed in a recent paper (Stocker 2014)
From page 20...
... Search Tools
Many DOTs have deployed content or document management systems and collaboration tools that have embedded search capabilities. For example, collaboration tools typically include a search box to help users navigate to content of interest.
From page 21...
... For example, a bridge design manual could be categorized using the following facets:
• Content type: Manual
• Mode: Roadway
• Asset type(s): Structures and Bridges
• Document owner: Bridge Division
• Business function: Design
• Sensitivity: Public
• Status: Current
• Issue date: April 1, 2015
If all of an agency's corporate documents were classified using these facets, it would be possible to build a one-stop shop for documents -- one that would allow employees in any division to find all of the:
• Corporate documents related to roadway design.
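The facet-based "one-stop shop" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Document` class, the `filter_by_facets` helper, the facet key names, and the second catalog entry are hypothetical and do not come from the guide; only the bridge design manual facet values are taken from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A corporate document tagged with facet values (hypothetical structure)."""
    title: str
    facets: dict[str, str] = field(default_factory=dict)


def filter_by_facets(documents, **selected):
    """Return the documents whose facets match every selected facet value."""
    return [
        doc for doc in documents
        if all(doc.facets.get(name) == value for name, value in selected.items())
    ]


# The first entry uses the facet values from the bridge design manual example;
# the second entry is invented purely for illustration.
catalog = [
    Document("Bridge Design Manual", {
        "content_type": "Manual",
        "mode": "Roadway",
        "asset_type": "Structures and Bridges",
        "document_owner": "Bridge Division",
        "business_function": "Design",
        "sensitivity": "Public",
        "status": "Current",
        "issue_date": "2015-04-01",
    }),
    Document("Snow Removal Procedures", {
        "content_type": "Manual",
        "mode": "Roadway",
        "business_function": "Maintenance",
        "status": "Current",
    }),
]

# "Corporate documents related to roadway design"
roadway_design = filter_by_facets(catalog, mode="Roadway", business_function="Design")
print([doc.title for doc in roadway_design])  # ['Bridge Design Manual']
```

Because each facet is an independent key, any combination of facets can be used as a filter without changing the data model, which is the property that makes a shared classification useful across divisions.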
From page 22...
... Implementing standard classifications in an agency can be challenging because it involves getting agreement across multiple business units (that may already have competing or inconsistent classification methods)
From page 23...
... delivery model for many business applications, including office and messaging software, payroll processing software, database management system (DBMS) software, CADD software, accounting, collaboration, customer relationship management, content management, and service desk management.
From page 24...
... Questions to Ask
An assessment of agency search and navigation capabilities should consider the following questions:
• Is the range of search needs well understood within the agency? What types of needs are most important to address from an agency business perspective?
From page 25...
... As used in this context, terminology refers to a variety of controlled vocabulary and semantic resources including glossaries, lists of synonyms, taxonomies, thesauri, and ontologies. These resources can be used to:
• Provide standard lists of values for certain metadata elements (e.g., a list of DOT organizational units, project phases, or infrastructure asset types)
From page 26...
... publicly available metadata standards for different content types and extend them as needed for internal use. Table I-2 lists selected standards that may be relevant to DOTs.
From page 27...
... Scope: General
Standard: Dublin Core (ISO 15836) – 15 standard elements: Contributor, Coverage, Creator, Date, Description, Format, Identifier, Language, Publisher, Relation, Rights, Source, Subject, Title, Type.
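As a rough illustration of how the 15 Dublin Core elements might be used as a core metadata set, the sketch below builds a record and rejects any field name outside the element list. The `make_record` helper and the sample values are hypothetical; only the element names come from the standard cited above.

```python
# The 15 elements of the Dublin Core Metadata Element Set, in lowercase.
DUBLIN_CORE_ELEMENTS = (
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
)


def make_record(**values):
    """Build a metadata record, rejecting names outside the 15 elements."""
    unknown = set(values) - set(DUBLIN_CORE_ELEMENTS)
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    # Every element is optional in Dublin Core, so absent ones are simply omitted.
    return dict(values)


# Sample values are invented for illustration.
record = make_record(
    title="Bridge Design Manual",
    creator="Bridge Division",
    date="2015-04-01",
    type="Text",
    subject="Bridges",
)
print(sorted(record))  # ['creator', 'date', 'subject', 'title', 'type']
```

An agency extending the standard for internal use would add its own elements to the allowed list rather than bypass the check, so that every repository shares the same core vocabulary.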
From page 28...
... Frequently, efforts to develop metadata schemes and ways of classifying information are performed in the context of particular initiatives (e.g., implementation of a new content management system)
From page 29...
... Agency master data that could be of value to identify for integrating into search capabilities includes:
• Districts/regions
• Organizational units
• Business function categories (more stable than organizational unit names, which tend to shift fairly often due to reorganization)
• Maintenance areas
Scope Resources Transportation Research Transportation Research Thesaurus Reference: http://trt.trb.org/trt.asp Australian Transport Index Thesaurus Reference: https://www.arrb.com.au/admin/file/content2/c7/ATRI_2013.pdf Public Sector Subject Vocabulary (UK)
From page 30...
... • Maintenance activities
• Projects
• Contracts
• Vendors
• Employees
• Partner organizations
• Routes
• Road sections
• Bridges
• Financial accounts
• Funding sources
• Programs
Embedding Metadata Creation into Information Management Workflow
Where possible, creation of metadata should be integral to workflows for creating and storing information resources. For example, part of saving a document within a document management system typically requires a user to fill in a set of metadata elements.
From page 31...
... as the full text of the document. A folder structure can be set up (within a file drive or within a document management system)
From page 32...
... Option 4: Centrally and Manually Assigned Metadata. This is the traditional approach to metadata assignment used in libraries.
From page 33...
... The disadvantage of this approach is that the cost of text analytics software is significant, typically running to six figures. A significant development effort also is required to establish rules for metadata assignment.
From page 34...
... Chapter 4
Any effort to improve findability should begin with an understanding of:
• The users (what they are seeking and why)
From page 35...
... • How much time do employees spend responding to questions when the information is (or should be) readily available on the agency's intranet site?
From page 36...
... search approaches, need for content classification metadata, and so forth. Understanding of requirements also provides the basis for determining whether the search need would be best addressed via existing content management systems, improvements to general enterprise search capabilities, or development of a search-based application that targets a specific business process or user group.
From page 37...
... • Reasons for limited search success:
  – Target material not available
  – Lack of search skills (e.g., use of wildcards, use of quotation marks)
  – Poorly configured search tools
  – Poor performance of text search
  – Poor metadata quality
An online survey is another approach that can be used to gain a broader understanding of search needs and behaviors.
From page 38...
... Repositories may include:
• An enterprise collaboration platform with sites for organizational units and project teams, used for active document sharing (e.g., SharePoint)
From page 39...
... Understanding Content Types and Formats
Once the relevant repositories have been identified, the next step is to understand what content types and formats they include. For each repository, the goal is to obtain a breakdown of both the number of information resources (e.g., documents, data tables, images)
From page 40...
... – Public notices
– Software manuals
– Specifications
– Timesheets
– Training materials
Content type classifications can be useful for establishing the scope of a findability improvement effort based on agency priorities and user needs. This type of content classification also can be used as a framework for defining information organization policies (what types of content should be stored where)
From page 41...
... • Document/classify
– Is any informal or formal process in place to assign metadata to the content (e.g., populating document properties for a spreadsheet, completing a metadata form for a spatial data file, updating data dictionaries)
From page 42...
... Chapter 5
Chapter 3 of this guide reviewed strategies that can be pursued to improve findability. Chapter 4 covered information gathering and analysis needs for planning improvements.
From page 43...
... to search across different repositories. Agreeing on a core set of metadata elements does not necessarily mean that the agency intends to create metadata for every piece of content; rather, it means that where metadata is to be created, it will utilize the core elements.
From page 44...
... • GIS unit
• Data management unit
• Risk management unit
Executives can provide their vision of what information needs to be easily located and accessed, and how, in order to ensure effective core business functions. Representative staff from core business units (e.g., in planning, engineering/design, construction, maintenance, and operations)
From page 45...
... Findability should be an important consideration in the selection and implementation of any of these initiatives. Keeping track of these initiatives and proactively working with the sponsors and project teams to ensure that each initiative incorporates elements of the overall findability vision allows the agency to leverage its ongoing technology investments to improve agency-wide findability.
From page 46...
... One effective combination is to implement content management software, content workflow automation, text analytics software, and faceted navigation. For example:
• Content management software provides a platform for content creation and storage.
From page 47...
... functions and processes for information governance that improve findability is to get it on the radar screen of agency leadership. It may be unrealistic to expect agency leaders to make information findability a constant or central area of focus, but it is important that they understand and support some key ideas.
From page 48...
... • Establish a "home base" for decisions about findability improvements.
– Create an agency information governance board with responsibility for improving information findability (as one element of its charter)
From page 49...
... – Adherence to established data and metadata standards.
– Frequency of review and required retention periods.
From page 50...
... Expertise for Findability
Specialized skills are required to perform many of the functions described in the preceding sections. Although some of these skills can be learned on the job, it is important to have a core team (at least two to three individuals)
From page 51...
... Tracking Progress and Measuring Performance
Defining performance measures for findability is critical to an effective outcome: What exactly does success look like? How do managers track progress?
From page 52...
... Attribute: Content Availability
Sample Measures:
• Amount or percentage of content of a specific type that is in electronic form (based on content analysis)
• Amount or percentage of content of a specific type that is text-searchable (based on content analysis)
From page 54...
... Abbreviations
AIIM – Association for Information and Image Management
ANSI – American National Standards Institute
BI – Business Intelligence
CADD – Computer-Aided Design and Drafting
DBMS – Database Management System
DCMI – Dublin Core Metadata Initiative
DOT – Department of Transportation
ECM – Enterprise Content Management System
FGDC – Federal Geographic Data Committee
FOIA – Freedom of Information Act
GEC – General Engineering Consultant
GIS – Geographic Information System
GPO – Government Printing Office
GSA – Google Search Appliance
HQ – Headquarters
HR – Human Resources
HTML – Hypertext Markup Language
HOV – High Occupancy Vehicle
ISO – International Organization for Standardization
IT – Information Technology
ITS – Intelligent Transportation Systems
KDOT – Kansas Department of Transportation
LCSH – Library of Congress Subject Headings
LiDAR – Light Detection and Ranging
MeSH – Medical Subject Headings
MMIS/SES – Maintenance Management Information System/Single Entry Screen
NIEM – National Information Exchange Model
NLM – National Library of Medicine
NISO – National Information Standards Organization
OCLC – Online Computer Library Center
OCR – Optical Character Recognition
OR – Operation Region
PDF – Portable Document Format
PO – Purchase Order
PS&E – Plans, Specifications, and Estimates
RACI – Responsible, Accountable, Consulted, Informed
ROT – Redundant, Outdated, and Trivial
ROW – Right-of-Way
From page 55...
... SaaS – Software as a Service
SBA – Search-Based Application
SEO – Search Engine Optimization
SQL – Structured Query Language
TRT – Transportation Research Thesaurus
UTP – Unified Transportation Plan
From page 56...
... Appendix A
Example Improvement Initiatives
Improvement 1: Focus on Findability of Construction Project Information
The Situation
DOT X has implemented a content management solution for its design drawings, but other content related to construction projects is created and managed at the district level, and there is no single central repository for this information. Practices vary with respect to what content is captured electronically, what file formats are used, where content is stored, and how content is organized.
From page 57...
... Contracts
  Change orders
  Invoices
  Inspection reports/notes/photographs
  Materials test results
  Emails
- Map out existing repositories with construction project-related information:
  Collaboration software team sites
  Engineering file repository – design plans
  Capital program management application – status, funding, schedule
  File cabinets/desk drawers
  District websites
Understand Findability Needs of Different Users:
- Conduct focus groups with construction project managers in each district to identify their information retrieval needs and level of satisfaction.
- Conduct focus groups with maintenance personnel to identify their needs for construction project information.
From page 58...
... Solution Element: Information Governance
Implementation Activities: Identify required and optional information to be stored and managed for construction projects. Develop standard file naming conventions for each content type.
From page 59...
... Solution Element: Content Conversion
Implementation Activities: Determine initial target for conversion of legacy content and complete conversion effort. Process legacy content through auto-categorization function to assign metadata and manually review/adjust results.
From page 60...
... Planning the Improvement
Survey the Information Landscape:
- Identify types of content to be included in the effort:
  Operations and Engineering Policies and Manuals – Road Design, Bridge Design, Bridge Inspection, Traffic Engineering, Maintenance, Access Management, Utilities Coordination, Geotechnical, Hydraulic, Landscape, Project Development, Right-of-Way
  Administrative Policies and Manuals – Business Manual, Risk Management Policy, Records-Retention Policy, Legal and Litigation Hold Policy
  Human Resources Policies and Manuals – Sick Leave Policy, Workers' Compensation Policy, Ethics Code, Family Medical Leave, Affirmative Action
  Financial Management Policies and Manuals – Invoicing, Revenue and Accounts Receivable, Advance Construction, Debt Management, Cash Balance, Local Assistance
  IT Policies and Manuals – System Access Management, Data Protection, System Procurement, System Development Life Cycle
- Map out existing repositories that store current policies, procedures and manuals:
  Intranet web pages – departmental, regional
  External website pages
  Collaboration software team sites
  Shared network drives
  File cabinets/desk drawers
Understand Findability Needs of Different Users:
- Conduct focus groups with a sample of employees in central and field offices – assess their understanding of how to access updated policies and manuals.
- Conduct interviews with a sample of managers to understand extent to which out-of-date materials are being used, and associated risks – gather illustrative examples.
From page 61...
... Search engines and interfaces
  Information repositories
  Classification and metadata
  Information producers
- Identify specific tasks for implementing the solution.
Solution Element: Information Governance
Implementation Activities: Identify and document which types of policy and procedural documents are to be governed centrally.
From page 62...
... Solution Element: Documentation and Training
Implementation Activities: Develop training materials for content creators on update procedures. Communicate new process to employees through targeted emails, website announcements, and presentations.
From page 63...
... Meeting notes
  Document templates (e.g., standard letters, forms)
- Map these content types to existing information repositories/sources:
  Employee directory
  Internal web content management system
  External websites or databases, RSS feeds, bookmarks
  Shared network drives
  Document management system
  Learning management system
  IT servers
Document the Information Management Life Cycle:
- For internally produced content types of relevance, document the existing processes for updates and metadata assignment.
From page 64...
... Solution Element: Terminology Resources
Implementation Activities: Develop the categories and standard keywords to be used for information organization and search. Leverage existing taxonomies (if available)
From page 65...
... Appendix B
Glossary
This glossary draws on the following sources:
AIIM – Association for Information and Image Management Glossary: http://www.aiim.org/community/wiki/view/glossary
IRMT – International Records Management Trust (IRMT) Glossary of Terms: http://www.irmt.org/documents/educ_training/term%20modules/IRMT%20TERM%20Glossary%20of%20Terms.pdf
ANSI/NISO Z39.19 – Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies (2005)
From page 66...
... Content object. An individual unit of content that may be described for inclusion in an information retrieval system, website, or other information source.
From page 67...
... Electronic discovery (e-discovery)
From page 68...
... Information management. The means by which an organization (e.g., a DOT)
From page 69...
... an ontology might define a relationship called "is a structural member of" to describe the structural elements of a bridge (e.g., trusses) and distinguish these from non-structural elements (e.g., railings)
From page 70...
... Spider. A computer program that scans the World Wide Web, following links on each page to identify new sites.
From page 71...
... Appendix C
Special Topics
This appendix provides more detailed coverage of several topic areas related to search, metadata and terminology, and text analytics, supplementing the main body of the guide.
Topic 1: Search
Search Engines: The Basics
Search engines are an essential technology supporting digital information retrieval.
From page 72...
... Since that initial breakthrough, Internet search has continued to improve. New features include refinement of relevance ranking based on user search history, and use of more advanced natural language processing techniques to better discern what the user is seeking.
From page 73...
... Federated search is becoming a standard for enterprise search across most industries and government agencies. Most enterprise search vendors offer the capability of setting up a federated search.
From page 74...
... Implementation Tips:
Supplement the text box with an advanced search capability for advanced users.
Support use of quotation marks around multi-word phrases for phrase searches.
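A minimal sketch of what supporting quotation marks for phrase searches can mean in practice is shown below. The `parse_query` and `matches` functions are hypothetical helpers written for this illustration; production search engines tokenize and index text rather than scanning it with substring checks.

```python
import re

# A double-quoted phrase is captured whole; anything else splits on whitespace.
_TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def parse_query(query):
    """Split a query into lowercase terms, keeping quoted phrases together."""
    return [(phrase or word).lower() for phrase, word in _TOKEN.findall(query)]


def matches(text, query):
    """True if every term or quoted phrase occurs in the text (substring match)."""
    haystack = text.lower()
    return all(term in haystack for term in parse_query(query))


print(parse_query('"bridge deck" inspection'))  # ['bridge deck', 'inspection']
print(matches("Annual bridge deck inspection report", '"bridge deck" inspection'))  # True
print(matches("Inspection of the deck of the bridge", '"bridge deck"'))  # False
```

The last call shows the point of phrase search: both words are present in the text, but not adjacent and in order, so the quoted query does not match.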
From page 75...
... Implementation Tips: Support advanced search (e.g., find presentations and spreadsheets that have been created in the last 12 months within one of the district offices and tagged with the categories "Meetings" and "Budget")
From page 76...
... Independence. In general, facets should represent different types of characteristics (e.g., file type, organizational unit, and status)
From page 77...
... general, supporting multiple values within each facet has been shown in usability tests to be too advanced/confusing for most users.)
Implementation Tips: Keep the number of facets between three and seven.
From page 78...
... The dictionary of terms used in auto-suggest may be based on the history of previous searches (by the individual user or by all users on the system), on a curated controlled vocabulary of terms, or on a combination of the two.
Advantages of using auto-suggest/type-ahead are:
Auto-suggest/type-ahead combines the benefits of searching (quickly and easily looking for something)
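The combined-dictionary approach to auto-suggest can be sketched as follows. This is an illustrative assumption about one simple way to do it (a sorted term list with a binary search for the prefix); the function names and the sample terms are hypothetical, and real products typically also rank suggestions by popularity.

```python
from bisect import bisect_left


def build_dictionary(controlled_terms, search_history):
    """Merge curated terms and past searches into one sorted, de-duplicated list."""
    return sorted({term.lower() for term in [*controlled_terms, *search_history]})


def suggest(prefix, dictionary, limit=5):
    """Return up to `limit` dictionary terms that start with `prefix`."""
    prefix = prefix.lower()
    start = bisect_left(dictionary, prefix)  # first term >= prefix
    suggestions = []
    for term in dictionary[start:]:
        if not term.startswith(prefix) or len(suggestions) == limit:
            break
        suggestions.append(term)
    return suggestions


# Sample terms are invented for illustration.
controlled = ["Bridge decks", "Bridge members", "Culverts"]
history = ["bridge inspection", "budget"]
terms = build_dictionary(controlled, history)

print(suggest("bri", terms))  # ['bridge decks', 'bridge inspection', 'bridge members']
print(suggest("bu", terms))   # ['budget']
```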
From page 79...
... Advantages of using the Best Bets approach are:
It provides a way of delivering better search results than using relevance rankings.
It saves users the time and effort required to scan long search results lists in order to find a document of interest.
From page 80...
... and moved to the top of the results list. The fact that this term was specifically included in the metadata provides a more reliable indicator that the document is about "transportation" than the number of times the word "transportation" appears in the document.
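One way to picture the effect described here is a scoring function that counts body-text occurrences but adds a fixed boost when the term appears in the document's metadata. The sketch below is a deliberately simplified assumption, not how any particular search engine ranks results; the `score` function, the boost value of 10, and both sample documents are hypothetical.

```python
def score(document, term, metadata_boost=10.0):
    """Body-text term count plus a fixed boost if the term is a metadata keyword."""
    term = term.lower()
    body_hits = document["body"].lower().split().count(term)
    in_metadata = term in (keyword.lower() for keyword in document["keywords"])
    return body_hits + (metadata_boost if in_metadata else 0.0)


# Sample documents are invented for illustration.
documents = [
    {"title": "A",
     "body": "transportation funding and transportation planning for transportation agencies",
     "keywords": ["Funding"]},
    {"title": "B",
     "body": "annual report on transportation",
     "keywords": ["Transportation"]},
]

ranked = sorted(documents, key=lambda d: score(d, "transportation"), reverse=True)
print([d["title"] for d in ranked])  # ['B', 'A']
```

Document B mentions the word only once, yet it ranks first because the term was deliberately assigned as metadata.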
From page 81...
... One of the most well-known metadata standards is the Dublin Core Metadata Element Set. Work on this standard originated at a 1995 workshop held in Dublin, Ohio, sponsored by the Online Computer Library Center (OCLC)
From page 82...
... In conjunction with deciding what type of metadata to maintain, it is important to develop an approach to metadata assignment.
Topic 3: Text Analytics
Background
Text analytics software was first developed in the 1990s and commercialized in the 2000s.
From page 83...
... Figure I-C-7. Text analytics application for known entity extraction.
From page 84...
... tenses and other variants)
From page 85...
... Figure I-C-9 illustrates application of a more complex set of rules that look for agricultural terms that:
Appear in the first 100–200 words of the document AND
Are not part of certain phrases (e.g., "Department of Agriculture" -- a formal phrase that shows up in multiple documents that are not really about agriculture)
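The two conditions above can be approximated in a short rule function. This is a sketch under stated assumptions, not the rule syntax of any text analytics product: the term list, the excluded-phrase list, the `is_about_agriculture` name, and the 200-word window are all hypothetical choices made for illustration.

```python
import re

AGRICULTURAL_TERMS = {"agriculture", "crops", "farming", "livestock"}  # hypothetical list
EXCLUDED_PHRASES = ("department of agriculture",)  # formal names that do not signal the topic


def is_about_agriculture(text, window=200):
    """Apply both rules: a term appears in the first `window` words AND
    is not part of an excluded phrase."""
    lead = " ".join(text.lower().split()[:window])  # keep only the leading words
    for phrase in EXCLUDED_PHRASES:
        lead = lead.replace(phrase, " ")  # blank out excluded phrases before matching
    tokens = set(re.findall(r"[a-z]+", lead))
    return bool(tokens & AGRICULTURAL_TERMS)


print(is_about_agriculture("Farming and livestock routes depend on rural roads."))  # True
print(is_about_agriculture("The Department of Agriculture approved the interchange permit."))  # False
```

A term that first appears after the window is ignored as well, which reflects the assumption that a document's subject is usually stated early.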
From page 86...
... Figure I-C-10. Text analytics rules template for auto-categorization.
From page 87...
... Figure I-C-11. Federated search with text analytics.
From page 88...
... Categorization by example is most useful when there are a limited number of categories and no need for a taxonomy or ontology. It is often considered to be superior because it does not require humans to create categorization rules, which is a skill that subject matter experts (SMEs) typically do not have.
From page 89...
... In the TRT controlled vocabulary, for example, bridge decks are found in the following hierarchy:
Bridges and Culverts
  Bridges
    Bridge members
      Bridge Superstructures
        Bridge Decks
If one were searching for all research articles on "bridge decks", the controlled vocabulary in the TRT would effectively take the searcher to what they are looking for. In addition, because the controlled vocabulary also addresses the use of synonyms, if one searched on the terms "bridge slab" or "bridge surface", the result of the search would include all articles on bridge decks.
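The hierarchy and synonym behavior described above can be modeled with two small lookup tables. The sketch below uses only the terms quoted in this passage; the dictionary layout and the `preferred_term` and `broader_terms` helpers are hypothetical and do not reflect how the TRT itself is stored or queried.

```python
# Each term maps to its immediate broader term (from the hierarchy in the text).
BROADER = {
    "bridge decks": "bridge superstructures",
    "bridge superstructures": "bridge members",
    "bridge members": "bridges",
    "bridges": "bridges and culverts",
}

# Synonyms map to the preferred term used for indexing.
SYNONYMS = {
    "bridge slab": "bridge decks",
    "bridge surface": "bridge decks",
}


def preferred_term(query):
    """Normalize a query and replace a synonym with its preferred term."""
    normalized = query.strip().lower()
    return SYNONYMS.get(normalized, normalized)


def broader_terms(term):
    """Walk up the hierarchy from a term to the top-level category."""
    path = []
    while term in BROADER:
        term = BROADER[term]
        path.append(term)
    return path


print(preferred_term("Bridge Slab"))  # bridge decks
print(broader_terms("bridge decks"))
# ['bridge superstructures', 'bridge members', 'bridges', 'bridges and culverts']
```

Resolving the synonym before searching is what lets a query for "bridge slab" return everything indexed under bridge decks.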
From page 90...
... support classification, categorization, and concept organization when designing a search structure. They allow users to browse and navigate the taxonomy from the top down, from broader to narrower terms.
From page 91...
... Potential disadvantages are:
Relevant subjects might not all fit into a neat hierarchical structure.
Comprehensive taxonomies can become too large to browse.
From page 92...
... Implementation Tips: Follow the ANSI/NISO Z39.19-2005 (R2010) standard, Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies.
Semantic Networks
Figure I-C-13.
From page 93...
... Semantic networks are often used by text analytics software to enhance its ability to interpret and process language. They are also used within semantic search and ontology-based applications that are able to apply reasoning based on semantic relationships.
From page 94...
... content objects. Another repository might have a "jurisdiction" field.
From page 95...
... Appendix D DOT Information Organization Resources
Several examples of DOT classification schemes are provided in this appendix. These are intended to illustrate current practice rather than to serve as examples of best practice.
From page 96...
... Example 1: Kansas DOT Enterprise Architecture – DOT Entities
An enterprise architecture initiative at the Kansas DOT identified the following hierarchical list of high-level entities. Each of these entities could be defined as a concept that a given piece of information might be "about":
Human Resources Employees Positions Organization Structure Project Information Programs Projects Activities Contracts Location – Geospatial Area-based Point-based Linear-based Cities Counties Geographic areas Geographic Maps Digital Photos (ortho quads)
From page 97...
... Public Entities Other Government Entities Financial Data Funds Budgets Financial Transactions Grants Accounts Bonds Loans Assets Buildings Equipment Land Materials Office Equipment Public Transportation Rest Areas Shop Equipment Storage Areas Towers Consumable Inventory Information Systems and Technology
Example 2: Texas DOT Document Management
This example, used for document classification, is based on DOT function.
Document Class Description 1.
From page 98...
... Document Class: 5. Equipment and Facilities
Description: Documents related to building facilities and equipment, including building construction; maintenance, operations, and security; equipment operations and maintenance; and hazardous materials.
From page 99...
... Document Class: 14. Traffic Operations
Description: Documents related to district traffic operations, including sign requests and issues, traffic program and projects, traffic safety grants, and traffic signal maintenance.
From page 100...
... 4.4 Recruitment
4.5 Workforce Estimates
5.0 Payroll
5.1 Memos – Directives
5.2 Forms
5.3 Schedules
5.4 Leave Report
5.5 Labor Reporting
5.6 Timesheets
5.7 Washington State DOT Employees Tracking Sheets
6.0 Secure Facilities
6.1 Memos – Directives
6.2 Policies
6.3 Contracts
6.4 Maintenance
6.5 Floor plan
6.6 HOV Office Start Up
6.7 Custodial Inspection Sheet
6.8 Office Moves
7.0 Safety
7.1 Memos – Directives
7.2 Secure Emergency Contacts
7.3 Building Evacuation Plan
7.4 Safety Forms
7.5 Safety Meetings
7.6 Safety Awards
7.7 Accident Reports
8.0 Washington State DOT Administration
8.1 Memos – Directives
8.2 Forms
8.3 Secretary
From page 101...
... 8.4 Phones & IT
8.5 Vehicles
8.6 Washington State DOT Manuals
8.7 Equipment Manuals
8.8 Correspondence
8.9 Parking and Keys
9.0 Secure Communications
9.1 Memos – Directives
9.2 Incoming Correspondence
9.3 Outgoing Correspondence
9.4 How To instructions
9.5 HQ & OR Communications
9.6 Reporting & Oversight
9.7 Planning
9.8 Internal Communications
9.9 External Communications
9.10 Maps & Graphics
9.11 Photos
9.12 Meeting Agendas
9.13 Special Projects
10.0 Secure HOV Management
10.1 Memos – Directives
10.2 Contracting
10.3 Transition
10.4 Invoice Review
Project Filing
1.0 Northbound XL3498
1.1 CAD
1.1.1 BaseFiles
1.1.2 CADDoc
1.1.3 FromDesign
1.1.4 Rsc
From page 102...
... 1.1.5 As-Builts
1.1.6 ChangeOrders
1.1.7 ContractPlans
1.1.8 PlansforApproval
1.1.9 RightofWayPlans
1.2 EngDataConst
1.2.1 ConstDoc
1.2.2 Deliverables
1.2.3 RWKs
1.2.4 Standards
1.2.5 Geometry
1.2.6 Libraries
1.2.7 Reports
1.2.8 Surfaces
1.2.9 Survey
1.3 EngDataDesign
1.3.1 Deliverables
1.3.2 DesignEngDoc
1.3.3 RWKs
1.3.4 Standards
1.3.5 Geometry
1.3.6 Libraries
1.3.7 Reports
1.3.8 Surfaces
1.4 Environmental
1.5 Estimates
1.6 Hydraulics_Report
1.7 Permits
1.8 Photogrammetry
1.9 Photos
1.10 Project_Documentation
1.11 Quantities
From page 103...
... 1.12 Scoping
1.13 Survey
1.13.1 Deliverables
1.13.2 Requests
1.13.3 SurveyDoc
1.13.4 RawData
1.13.5 WorkingData
Example 4: Mississippi DOT
Internal Services Audit Human Resources Information Systems Legal Public Affairs State Aid Commission Administrative Services Asset Management Facilities and Records Management Financial Management General Services Procurement Special Projects Support Services Projects Bridge Construction Contract Administration Consulting Services Environmental Enforcement Maintenance Materials Planning Programming Rails Research
From page 104...
... Right-of-way Roadway Design Traffic Engineering Transportation Information (GIS) Districts District 1 District 2 District 3 District 4 District 5 District 6 District 7 Internal Planning Aeronautics Freights Ports Waterways Public Transit Program-Specific Plans and Studies
- Long-Range Transportation Plan
- Modal Plans
- Freight Plan
- Corridor Plans
- Asset Management Plans
- Transportation Studies
- Traffic Engineering Studies
- Safety Studies
- Cost Allocation Studies
- Research Reports
- Customer Surveys
- Employee Surveys
Program Development
- Grant Applications and Awards
- Needs/Candidate Project Lists
- Program Plans
- Program Performance Reports
- State Transportation Improvement Plan
Engineering
- Design Standards and Specifications
- Product Evaluations
From page 105...
... - Land Surveys
- Structure Inspection Reports
- Design Plans/Engineering Drawings
- Value Engineering Studies
Project Development
- Right-of-Way Maps
- Utility Relocation Records
- Property Acquisition Records
- Property Deeds and Titles
- Categorical Exclusions
- Environmental Impact Statements
- Environmental Assessments
- Environmental Decisions
- Engineering Drawings/Plans
- Engineering Calculations
- Project Cost Estimates
Construction Projects
- Project Advertisement Reports
- Bid Notices
- Bid Proposals
- Construction Agreements
- Construction Contracts
- Subcontracts
- Change Orders
- Daily Inspection Reports
- Materials Test Reports
- Claims
- As-Built Plans
Maintenance and Operations
- Maintenance and Operations Procedures
- Customer Complaint Reports
- Utility Permits
- Access Permits
- Outdoor Advertising Permits
- Oversize/Overweight Permits
- Maintenance Records
- Signal Timing Records
- Incident Logs/Reports
- Crash Records
From page 106...
... Administrative
- Policy and Procedure Guidelines and Manuals
- Calendars and Schedules
- Administrative Contracts and Agreements
- Correspondence
- Business Plans
- Press Releases
- Public Notices
- Meeting Notes
- Public Records Requests
- Contact Lists
- Newsletters and Publications
- Audit Reports
- Federal Grantee Reports
- Contractor Compliance Reports
Financial
- Invoices
- Budgets
- Financial Reports
Plant and Facilities
- Equipment Manuals
- Equipment Inventory
- Building Inventory
- Property Disposition Records
- Work Orders
- Maintenance Reports
- Inspection Reports
Example 5: Virginia DOT Document Descriptors
The following keywords (from a controlled vocabulary) are used to characterize document types within the Virginia DOT collaboration site:
Audit Budget Contract Employee Benefits Evaluation Facilities Form Legal Legislative Lessons Learned Manual
From page 107...
... Memorandum Org Chart Performance Permit Plan Sheet Policy Presentation Procedure Project Report Security Specification Strategic Plan Template Training Material
Example 6: Washington State DOT Asset Types
This example shows an information classification scheme. The Washington State DOT conducted a research project with Kent State University to develop a taxonomy of asset types (see Winkler [2014]
From page 108...
... Figure I-D-1. Washington State DOT high level asset classification scheme (proposed)
From page 109...
... Example 7: "Typical" DOT Organizational Functions
Figure I-D-2 shows a hierarchical set of DOT organization functions, developed based on a review of several DOT organization charts. This resource could be used to develop a set of functional categories for DOT information classification that is generic in nature (i.e., independent of specific actual business units, which can and do change over time)
From page 110...
... Appendix E Examples of Commercially Available Enterprise Search and Text Analytics Products
This appendix presents additional information on companies that provide text analytics products and software. The enterprise collaboration platform software and other products mentioned are examples of products available at the time of this research and are included as information without implying specific endorsement.
From page 111...
... Table I-E-1. Examples of commercially available enterprise search products.
From page 112...
... I-112 improving Findability and Relevance of Transportation information Company/Product Description General Comments Sample Customers or Partners a Solr Solr is an open source enterprise search platform built on Lucene (an open source text search engine library written in Java)
From page 113...
... examples of Commercially Available enterprise Search and Text Analytics products I-113 Company Product Description General Comments Sample Customers or Partners a BA Insight Smart Analytics SharePoint based text analytics and search analytics; includes autoclassification capabilities They have multiple connectors to incorporate content from sources outside of SharePoint U.S. Department of Homeland Security; U.S.
From page 114...
... I-114 improving Findability and Relevance of Transportation information Company Product Description General Comments Sample Customers or Partners a Data Harmony // Access Innovations Thesaurus Master ® Controlled vocabulary development; provides thesaurus and taxonomy management Primarily a taxonomy and vocabulary management company; have a full-featured text analytics capability M.A.I.™ Document indexing; Statistics Collector submits suggestions for improvement MAIstro ® Bundles Thesaurus Master and M.A.I.; includes additional features such as metadata extractor http://www.dataharmony.com/products/ Expert System Cogito Discover Data extraction, semantic tagging, structured information loading, standard or customized taxonomy development, and automatic categorization Full-featured text analytics development platform; recently bought Temis, another leading text analytics company Google Cloud Platform; Accenture; Capgemini; Esri Cogito Studio Combines rules generation, ontology creation, taxonomy customization, extraction customization, and semantic network enrichment Luxid Annotation Server Extracts information using morphosyntactic reasoning, statistics, thesaurus-/taxonomy-/ontologybased extraction, machine learning, and rules-based extraction Luxid Webstudio Web-application for ontology management and semantic enrichment for maintaining a shared ontology http://www.expertsystem.com/products/ IBM SPSS ® Text Analytics for Surveys Watson Categorizes survey responses to turn survey text into quantitative data Commercial version of the Jeopardy-winning software. Initial module was for healthcare.
From page 115...
... examples of Commercially Available enterprise Search and Text Analytics products I-115 Company Product Description General Comments Sample Customers or Partners a Lexalytics Semantria Includes categorization and named entity extraction tools; integrates with Excel; provides output to use with BI tools Platform for text analytics applications; primary focus is on sentiment analysis but also has enterprise categorization and metadata capabilities https://www.lexalytics.com/semantria Luminoso Luminoso API Features include auto-tagging, classification, conceptual search, topic correlation, topic clustering, and predictive modeling Focus is on enhancing their automatic capabilities; very advanced technology; aim is to grow "common sense" brain Intel; Autodesk; Sony; Target; NASA; CDC; Scotts Miracle-Gro; TNS http://www.luminoso.com/products/api/ MeaningCloud Topics Extraction Entity and concept extraction using complex natural language processing techniques. User can create a customized dictionary to use Primary focus is on social media and sentiment analysis.
From page 116...
... Table I-E-2 (Continued)
From page 117...
... examples of Commercially Available enterprise Search and Text Analytics products I-117 Company Product Description General Comments Sample Customers or Partners a Pool Party Basic Server Taxonomy and thesaurus management with additional optional features Primarily a taxonomy management company; have added limited text analytics – mostly entity extraction and autoclassification Boehringer Ingelheim; Credit Suisse; Council of the European Union; Pearson; RedBull Media House; The World Bank; The Pokémon Company International; Wolters Kluwer Advanced Server Taxonomy and thesaurus management, linked data management, and ontology management included; concept tagging and semantic search are optional Enterprise Server All features of Advanced Server, plus concept tagging, text mining and entity extraction, and content recommender included; semantic search, data integration, and data analytics and visualization are optional Semantic Integrator All included and optional features of enterprise server are included https://www.poolparty.biz/product-overview/ Provalis Research QDA Miner Software for qualitative data analysis to code, annotate or search text or images; can extract information from text or images Primarily text as data, text mining, and entity extraction WordStat Text analysis tool includes text mining and visualization tools for theme extraction; provides ability to create taxonomies with words, patterns, and proximity rules; includes document classification tools; integrates with QDA Miner and requires installed version of either QDA Miner or SimStat ProSuite Integrates QDA Miner, WordStat, and SimStat (a statistical analysis product) ; integrates structured and unstructured data http://provalisresearch.com/products/ (continued on next page)
From page 118...
... Table I-E-2 (Continued)
From page 119...
... examples of Commercially Available enterprise Search and Text Analytics products I-119 Company Product Description General Comments Sample Customers or Partners a Smartlogic Classification Server Provides rule-based classification and natural language processing to tag metadata; Includes customizable entity and fact extraction Full development platform; focus is on ontology as structure Ontology Editor Users can model concepts, topics, structures and relationships in this ontology management platform; web-based interface allows for collaboration Search Application Framework Integrates Ontology Editor model into a customizable search engine, with functionality including taxonomy or entity-based facet navigators, topic maps, and filters Text Miner Highlights most used language to inform taxonomy and ontology development http://www.smartlogic.com/what-we-do/products-overview Verint Systems Verint ® Text Analytics™ Separates employee and customer streams using conversational analytics; provides automated theme discovery; includes interface for visualizations Primarily social and customer analysis; adding more text analytics http://www.verint.com/solutions/customer-engagement-optimization/voice-of-the-customeranalytics/products/text-analytics/
From page 120...
... Table I-E-3. Examples of free text analytics and text mining software.

This material may be derived from roughly machine-read images, and so is provided only to facilitate research.