Automated Extraction and Retrieval of Metadata by Data Mining: a Case Study of Mining Engine for National Land Survey Sweden
University of Gävle, Department of Technology and Built Environment.
2010 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

Metadata is information that describes geographical data resources and their key elements. It is used to guarantee the availability and accessibility of the data. ISO 19115 is a metadata standard for geographical information that makes geographical metadata shareable, retrievable, and understandable at the global level. Because geographical data is massive, high-dimensional, and highly diverse, data mining is a suitable method for discovering its metadata.

This thesis develops and evaluates an automated mining method for extracting metadata from the data environment on the Local Area Network at the National Land Survey of Sweden (NLS). These metadata are prepared and provided across Europe according to the metadata implementing rules for the Infrastructure for Spatial Information in Europe (INSPIRE). The metadata elements are defined according to the numerical formats of four different data entities: document data, time-series data, webpage data, and spatial data. To evaluate the method and guide further improvement, a few attributes and the corresponding metadata of geographical data files are extracted automatically as metadata records during testing and stored in a database. Based on the extracted metadata schema, a retrieval function finds the files that match a metadata keyword entered by the user. Overall, the average success rate of metadata extraction and retrieval is 90.0%.

The mining engine is developed in the C# programming language on top of a SQL Server 2005 database. Lucene.net is also integrated within Visual Studio 2005 to build an indexing framework for extracting and accessing the metadata in the database.
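
The extract-then-retrieve workflow described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation, which uses C#, SQL Server 2005, and Lucene.net. The sketch assumes Python as a stand-in, an in-memory dictionary in place of the database and index, and a hypothetical record layout (`title`, `format`, `size_bytes`, `modified`) built only from file-system attributes. The function names `extract_record`, `build_index`, and `search` are likewise illustrative.

```python
import os
import re
from collections import defaultdict
from datetime import datetime, timezone


def extract_record(path):
    """Build one metadata record from basic file-system attributes.

    The field names are hypothetical; they are not the ISO 19115 or
    INSPIRE element names used in the thesis.
    """
    stat = os.stat(path)
    stem, extension = os.path.splitext(os.path.basename(path))
    return {
        "path": path,
        "title": stem,
        "format": extension.lstrip(".").lower(),
        "size_bytes": stat.st_size,
        "modified": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
    }


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(root):
    """Walk a directory tree and return (records, inverted index).

    The inverted index maps each token found in a file's title or
    format to the set of file paths whose record contains it.
    """
    records = {}
    index = defaultdict(set)
    for folder, _dirs, files in os.walk(root):
        for filename in files:
            record = extract_record(os.path.join(folder, filename))
            records[record["path"]] = record
            for token in tokenize(record["title"]) + [record["format"]]:
                if token:
                    index[token].add(record["path"])
    return records, index


def search(index, keyword):
    """Return the sorted paths whose metadata match every keyword token."""
    tokens = tokenize(keyword)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)
```

Calling `build_index` on a folder and then `search(index, "roads 2009")` returns only the files whose title or format contains both tokens. A full-text library such as Lucene.net adds ranking, analyzers, and persistent storage that this sketch leaves out.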

Place, publisher, year, edition, pages
2010. p. 45
Keywords [en]
data mining, geographical metadata, ISO 19115, Lucene.net
National Category
Mineral and Mine Engineering
Identifiers
URN: urn:nbn:se:hig:diva-6811
Archive number: TEX100724
OAI: oai:DiVA.org:hig-6811
DiVA, id: diva2:321624
Presentation
(English)
Uppsok
Technology
Supervisors
Examiners
Available from: 2010-06-11. Created: 2010-06-01. Last updated: 2010-06-11. Bibliographically approved

Open Access in DiVA

fulltext (524 kB), 2339 downloads
File information
File name: FULLTEXT01.pdf
File size: 524 kB
Checksum (SHA-512): e2d1f680e3651fe972eb43d5edacb19962d2a225550d7df8c9844cf0b5d350f10dd4b4a7729b793d926665b6da2e4f68bf3ac5ef46b19b2b02c790ce898f4ce9
Type: fulltext
Mimetype: application/pdf

Total: 2339 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 1267 hits