How Dataverse Handles Shapefiles
A shapefile is a set of files, often uploaded or transferred as a .zip archive. The set may contain up to 15 files. At least 3 specific files (.shp, .shx, .dbf) are needed for a valid shapefile, and a 4th file (.prj) is required for WorldMap or for any other meaningful visualization.
For ingest and for connecting to WorldMap, these 4 files are the minimum required:
- .shp - shape format; the feature geometry itself
- .shx - shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly
- .dbf - attribute format; columnar attributes for each shape, in dBase IV format
- .prj - projection format; the coordinate system and projection information, a plain text file describing the projection using well-known text format
- The uploaded .zip is unpacked (as with all .zip files).
- Shapefile sets are recognized by a shared base name and specific extensions.
- Example: the following individual files constitute a shapefile set. The first four are the minimum required (.shp, .shx, .dbf, .prj).
  - bicycles.shp (required extension)
  - bicycles.shx (required extension)
  - bicycles.prj (required extension)
  - bicycles.dbf (required extension)
  - bicycles.sbx (NOT required extension)
  - bicycles.sbn (NOT required extension)
- Upon recognition of the 4 required files, the Dataverse will:
  - Group them as well as any other relevant files into a shapefile set
    - Files with these extensions will be included in the shapefile set:
      - required: "shp", "shx", "dbf", "prj"
      - optional: "sbn", "sbx", "fbn", "fbx", "ain", "aih", "ixs", "mxs", "atx", "cpg", "shp.xml"
  - Create a new .zip with "mimetype" as a shapefile.
  - The shapefile set will persist as this new .zip
  - Connected to this new set, a shapefile metadata block will be created containing file info: name, size, date
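The recognition and grouping steps above can be sketched in a few lines of Python. This is only an illustration, not the Dataverse implementation: the function names `split_known_extension` and `group_shapefile_sets` are hypothetical, and the sketch assumes that extensions are matched case-sensitively and that a set only counts when all 4 required extensions are present.

```python
from collections import defaultdict

REQUIRED = ("shp", "shx", "dbf", "prj")
OPTIONAL = ("sbn", "sbx", "fbn", "fbx", "ain", "aih",
            "ixs", "mxs", "atx", "cpg", "shp.xml")
# Longest first, so the compound extension "shp.xml" is tried before the
# three-letter ones.
KNOWN = sorted(REQUIRED + OPTIONAL, key=len, reverse=True)


def split_known_extension(filename):
    """Return (base, ext) if filename ends in a recognized extension, else None."""
    for ext in KNOWN:
        suffix = "." + ext
        if filename.endswith(suffix) and len(filename) > len(suffix):
            return filename[:-len(suffix)], ext
    return None


def group_shapefile_sets(filenames):
    """Split filenames into complete shapefile sets and leftover files."""
    candidates = defaultdict(dict)  # base name -> {extension: filename}
    for name in filenames:
        parts = split_known_extension(name)
        if parts:
            base, ext = parts
            candidates[base][ext] = name

    # Keep only the base names that have all four required extensions.
    sets = {}
    for base, members in candidates.items():
        if all(ext in members for ext in REQUIRED):
            sets[base] = sorted(members.values())

    grouped = {name for members in sets.values() for name in members}
    leftovers = [name for name in filenames if name not in grouped]
    return sets, leftovers
```

Under these assumptions, a file such as `lonely.shp` with no matching .shx, .dbf, and .prj files is returned among the leftovers rather than treated as a shapefile set.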
A file named bikes_and_subways.zip is uploaded to the Dataverse. This .zip contains the following files.
- bicycles.shp (shapefile set #1)
- bicycles.shx (shapefile set #1)
- bicycles.prj (shapefile set #1)
- bicycles.dbf (shapefile set #1)
- bicycles.sbx (shapefile set #1)
- bicycles.sbn (shapefile set #1)
- bicycles.txt
- the_bikes.md
- readme.txt
- subway_line.shp (shapefile set #2)
- subway_line.shx (shapefile set #2)
- subway_line.prj (shapefile set #2)
- subway_line.dbf (shapefile set #2)
Upon ingest, Dataverse unpacks the file bikes_and_subways.zip. Upon recognizing the shapefile sets, it groups those files together into new .zip files:
- The files making up the "bicycles" shapefile become a new .zip
- The files making up the "subway_line" shapefile become a new .zip
- The remaining files stay as they are.
To ensure that a shapefile set remains intact, individual files such as bicycles.sbn are kept in the set even though they are not used for mapping. The resulting files are:
- bicycles.zip (Contains shapefile set #1: bicycles.shp, bicycles.shx, bicycles.prj, bicycles.dbf, bicycles.sbx, bicycles.sbn)
- bicycles.txt (separate, not part of a shapefile set)
- the_bikes.md (separate, not part of a shapefile set)
- readme.txt (separate, not part of a shapefile set)
- subway_line.zip (contains shapefile set #2: subway_line.shp, subway_line.shx, subway_line.prj, subway_line.dbf)
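The example above can be reproduced with Python's standard `zipfile` module. The sketch below is a rough approximation under stated assumptions, not the Dataverse code: the function name `repackage` is hypothetical, the archive is assumed to be flat (no subdirectories), and file names are split at the first dot, which would mishandle base names that themselves contain dots.

```python
import io
import zipfile
from collections import defaultdict

REQUIRED = {"shp", "shx", "dbf", "prj"}
OPTIONAL = {"sbn", "sbx", "fbn", "fbx", "ain", "aih",
            "ixs", "mxs", "atx", "cpg", "shp.xml"}


def repackage(upload_bytes):
    """Return {output filename: bytes} for an uploaded .zip archive."""
    outputs = {}
    candidates = defaultdict(list)  # base name -> [(extension, member name)]
    with zipfile.ZipFile(io.BytesIO(upload_bytes)) as upload:
        for name in upload.namelist():
            # Simplification: split at the first dot, so "a.shp.xml" -> ("a", "shp.xml").
            base, _, ext = name.partition(".")
            if ext in REQUIRED | OPTIONAL:
                candidates[base].append((ext, name))
            else:
                outputs[name] = upload.read(name)  # not a shapefile member

        for base, members in candidates.items():
            if REQUIRED <= {ext for ext, _ in members}:
                # Complete set: write its members into a new in-memory .zip.
                buffer = io.BytesIO()
                with zipfile.ZipFile(buffer, "w") as bundle:
                    for _, name in members:
                        bundle.writestr(name, upload.read(name))
                outputs[base + ".zip"] = buffer.getvalue()
            else:
                # Incomplete set: leave the files as they are.
                for _, name in members:
                    outputs[name] = upload.read(name)
    return outputs


# Build bikes_and_subways.zip in memory with placeholder contents.
names = [
    "bicycles.shp", "bicycles.shx", "bicycles.prj", "bicycles.dbf",
    "bicycles.sbx", "bicycles.sbn", "bicycles.txt", "the_bikes.md", "readme.txt",
    "subway_line.shp", "subway_line.shx", "subway_line.prj", "subway_line.dbf",
]
source = io.BytesIO()
with zipfile.ZipFile(source, "w") as archive:
    for name in names:
        archive.writestr(name, b"placeholder")

result = repackage(source.getvalue())
print(sorted(result))
# ['bicycles.txt', 'bicycles.zip', 'readme.txt', 'subway_line.zip', 'the_bikes.md']
```

Note that bicycles.txt shares a base name with the "bicycles" set but stays separate, because "txt" is not one of the recognized shapefile extensions.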
For the two "final" shapefile sets, bicycles.zip and subway_line.zip, a new mimetype is used:
- Mimetype: application/zipped-shapefile
- Text for user: "Shapefile as ZIP Archive"