This R package streamlines the digitization of plots. It has two modes: 1) manual point selection for scatterplots, forked from the digitize package, and 2) automatic extraction for well-behaved line plots, built on the magick library. It currently supports three bitmap image formats (JPEG, PNG, BMP) and detects the image type automatically using the readbitmap package.
Installation requires devtools.
To install from GitHub:
if(!require(devtools)) install.packages('devtools')
library(devtools)
devtools::install_github("east-winds/digitize")

Or, to install locally:
if(!require(devtools)) install.packages('devtools')
library(devtools)
devtools::install("path/to/digitize")

library(digitize)
## make a temporary image
tmp <- tempfile()
png(tmp)
plot(x = 0:10, y = rnorm(11) + 0:10, xlab="x", ylab="y",
     xlim=c(0,10), ylim=c(-1,11), type="l")
grid()
dev.off()
## auto-digitize figure using two calibration points and
# pre-specifying both x-axis and y-axis
# Select calibration points (0,0) and (10,10) in blue:
myfn <- digitize(tmp, x1=0, x2=10, y1=0, y2=10, twopoints=TRUE, auto=TRUE)
#> ...careful how you calibrate.
#> Click IN ORDER: x1y1, x2y2
#>
#> Step 1 ----> Click on x1y1
#> |
#> |
#> |
#> y1
#> |______x1____________________
#>
#> Step 2 ----> Click on x2y2
#> |
#> y2
#> |
#> |
#> |_____________________x2_____
#>
#>
#>
#>
#> .....AUTOMATED INPUT.....
#>
#> Attempting to use `magick` to extract curve
# Plot returned spline function
x = seq(0,10,0.1)
plot(x, myfn(x), type='l', main = 'Extracted data')
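
Because the auto mode returns a function rather than a table of points, it can help to evaluate it on a grid and save the result. The following is a minimal sketch using base R only. The `myfn` here is a hypothetical stand-in built with `splinefun()` so the snippet runs without any clicking; in practice it would be the function returned by `digitize(..., auto=TRUE)`.

```r
# Stand-in for the function returned by digitize(..., auto = TRUE).
# Assumption: the returned object can be called like a base R spline function.
myfn <- splinefun(x = 0:10, y = 0:10)

# Evaluate the function on a regular grid over the calibrated x-range
x <- seq(0, 10, by = 0.1)
extracted <- data.frame(x = x, y = myfn(x))

# Write the tabulated curve to a CSV file (a temporary file in this sketch)
out <- tempfile(fileext = ".csv")
write.csv(extracted, out, row.names = FALSE)
```

The grid spacing is a free choice; a finer grid gives a denser table but no extra information beyond what the extracted curve contains.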
Experimental: Extracting multiple lines
digitize(..., auto=T, lines=2) will attempt to extract two lines from the same graph (works best for red and blue), returning a list of interpolation functions.
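
A list of interpolation functions can be evaluated on a common grid and plotted together. The sketch below is illustrative only: `fns` is a hypothetical stand-in for the returned list, built with base R `splinefun()`, and it assumes each list element can be called as `f(x)`.

```r
# Hypothetical stand-in for the list returned by
# digitize(..., auto = TRUE, lines = 2): one function per extracted line.
fns <- list(
  splinefun(x = 0:10, y = 0:10),   # rising line
  splinefun(x = 0:10, y = 10:0)    # falling line
)

# Evaluate every function on the same grid; each column of ys is one line
x <- seq(0, 10, by = 0.5)
ys <- sapply(fns, function(f) f(x))

# Plot both extracted lines in one figure
matplot(x, ys, type = "l", lty = 1, col = c("red", "blue"),
        xlab = "x", ylab = "y", main = "Extracted data")
```

Keeping the results in a matrix with one column per line makes it easy to compare the curves or export them side by side.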
library(digitize)
## make a temporary image
tmp <- tempfile()
png(tmp)
plot(rnorm(10) + 1:10, xlab="x", ylab="y",
xlim=c(0,10),ylim=c(0,10), xaxs="i", yaxs="i")
dev.off()
#> RStudioGD
#> 2
## manually digitize figure,
# pre-specifying both x-axis and y-axis
mydata <- digitize(tmp, x1=0, x2=10, y1=0, y2=10, twopoints=TRUE)
#> ...careful how you calibrate.
#> Click IN ORDER: x1y1, x2y2
#>
#> Step 1 ----> Click on x1y1
#> |
#> |
#> |
#> y1
#> |______x1____________________
#>
#> Step 2 ----> Click on x2y2
#> |
#> y2
#> |
#> |
#> |_____________________x2_____
#>
#>
#>
#>
#> .....MANUAL INPUT.....
#>
#> Click all the data. (Do not hit ESC, close the window or press any mouse key.)
#>
#> Once you are done - exit:
#>
#> - Windows: right click on the plot area and choose 'Stop'!
#>
#> - X11: hit any mouse button other than the left one.
#>
#> - quartz/OS X: hit ESC
# Plot returned points
plot(mydata$x, mydata$y, main = 'Extracted data')
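
The two-point calibration used in both examples (clicking x1y1 and x2y2, with known values `x1`, `x2`, `y1`, `y2`) can be thought of as a linear map from pixel coordinates to data coordinates. The sketch below illustrates that idea only; `calibrate_point` is a hypothetical helper, not a function in this package, and it assumes both axes are linear (not log-scaled). The pixel values are made up for illustration.

```r
# Hypothetical helper (not part of the digitize package): map clicked pixel
# coordinates to data coordinates from two calibration points.
# Assumption: both axes are linear.
calibrate_point <- function(px, py, cal_px, cal_py, x1, x2, y1, y2) {
  # cal_px, cal_py: pixel coordinates of the two calibration clicks,
  # in the order x1y1, x2y2
  x <- x1 + (px - cal_px[1]) * (x2 - x1) / (cal_px[2] - cal_px[1])
  y <- y1 + (py - cal_py[1]) * (y2 - y1) / (cal_py[2] - cal_py[1])
  data.frame(x = x, y = y)
}

# Calibration clicks at pixels (50, 400) and (450, 80) correspond to the
# data points (0, 0) and (10, 10); convert a click at pixel (250, 240).
pt <- calibrate_point(px = 250, py = 240,
                      cal_px = c(50, 450), cal_py = c(400, 80),
                      x1 = 0, x2 = 10, y1 = 0, y2 = 10)
```

Because the map is an interpolation between the two clicks, small errors in the calibration clicks shift every extracted point, which is why careful calibration matters.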
If you use the auto-digitization feature, please reference the GitHub repo: https://github.com/east-winds/digitize/
If you use the manual scatterplot features, please reference: https://github.com/tpoisot/digitize#citation
Contributions welcome.



