rOpenSci | Blog


Announcing pdftools 1.0

This week we released version 1.0 of the rOpenSci pdftools package to CRAN. The package provides utilities for extracting text, fonts, attachments and other data from PDF files. It also supports rendering PDF files to bitmap images. This release has a few internal enhancements and fixes an annoying bug affecting landscape PDF pages. The version bump to 1.0 signifies that the package has undergone sufficient testing and that the API is stable....

Tesseract Update: Options and Languages

A few weeks ago we announced the first release of the tesseract package: a high-quality OCR engine in R. We have now released an update with extra features. Installing Training Data: As explained in the first post, the tesseract system is powered by language-specific training data. By default, only English training data is installed. Version 1.3 adds utilities to make it easier to install additional training data....

High Performance CommonMark and GitHub Markdown Rendering in R

This week the folks at GitHub open-sourced their fork of libcmark (based on the extensive PR by Mathieu Duponchelle), which they use to render markdown text within documents, issues, comments and anything else on the GitHub website. The new release of the commonmark R package incorporates this library so that we can take advantage of GitHub-quality markdown rendering in R. The most exciting change is that the library has gained an extension mechanism to provide optional rendering features that are missing from the CommonMark spec....

The rOpenSci geospatial suite

Geospatial data (data embedded in a spatial context) is used across disciplines, including history, biology, business, tech, and public health. Along with community contributors, we’re working on a suite of tools to make working with spatial data in R as easy as possible. If you’re not familiar with geospatial tools, it’s helpful to see what people do with them in the real world. Example 1: One of our geospatial packages, geonames, is used for geocoding, the practice of either sorting out place names from geographic data, or vice versa....

fauxpas - HTTP conditions package

HTTP, or Hypertext Transfer Protocol, is the protocol by which most of us interact with the web. When we make a request to a website in a browser on desktop or mobile, or get some data from a server in R, all of that uses HTTP. HTTP has a rich suite of status codes describing different HTTP conditions, ranging from success to various client errors and server errors. R has a few HTTP client libraries (crul, curl, httr, and RCurl), each of which is slightly different....
