Using Antlr to parse date ranges in Java and Kotlin

I wanted to parse date ranges that can appear as “01.01.”, “01.01.-05.01.”, “01.01.-05.01./09.01.” or similar combinations. To parse all possible combinations correctly, I used ANTLR.

First I had to create rules in a file that I named “Dates.g4” that define what is a valid date range:

grammar Dates;
r: (element (divider? element)*);
element: (daterange | singledate);
daterange: date minus date;
singledate: date;
minus: '-' | '–';
divider: '/';
date: day '.' month ('.')?;
day: INT;
month: INT;
INT: [0-9]+;
WS: [ \t\r\n]+ -> skip ;

Let’s see what this does. The “grammar” line just defines the grammar’s name. The next line defines the parser rule “r”, which serves as the entry point: it consists of one “element” followed by any number of further elements, each optionally preceded by a “divider”. The next line defines what such an “element” is: either a “daterange” or a “singledate”. All other parser rules are defined the same way. A question mark makes the preceding item optional, i.e. it does not need to appear in the parsed text, and an asterisk allows the preceding group to repeat zero or more times.

The rules whose names start with an uppercase letter are lexer rules. Instead of combining other parser rules, they describe which characters form a token. Here “INT” is a sequence of digits, and “WS” matches whitespace, which “-> skip” discards so that the parser never sees it.
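To illustrate what the lexer rules do, here is a small hand-written sketch in plain Java that splits the input into the same kinds of tokens: digit sequences, the literal characters used in the parser rules, and whitespace that is dropped. It is not the code ANTLR generates; the class name DatesTokenSketch and the method tokenize are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; the real lexer is the generated DatesLexer.
public class DatesTokenSketch {
    // num mirrors INT: [0-9]+
    // lit mirrors the literals '.', '-', '–' (U+2013) and '/' from the parser rules
    // ws  mirrors WS: [ \t\r\n]+
    private static final Pattern TOKEN = Pattern.compile(
            "(?<num>[0-9]+)|(?<lit>[.\\-\\u2013/])|(?<ws>[ \\t\\r\\n]+)");

    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        // find() silently skips characters that match no alternative;
        // ANTLR would additionally report a token recognition error for them.
        while (m.find()) {
            if (m.group("ws") == null) { // whitespace is dropped, like "-> skip"
                tokens.add(m.group());
            }
        }
        return tokens;
    }
}
```

Calling DatesTokenSketch.tokenize("01.01. - 05.01.") returns the nine tokens 01, ., 01, ., -, 05, ., 01 and . in that order. The parser rules then only have to describe how such tokens may be combined.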

I used IntelliJ IDEA with the ANTLR plugin. To generate the necessary Java classes from the *.g4 file above, I just had to right-click the file and choose “Generate ANTLR Recognizer”.

This creates several classes. You can change the output directory and package by choosing “Configure ANTLR” in the same context menu.

The generated classes are easy to use, e.g. in Kotlin:

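import org.antlr.v4.runtime.CharStreams
import org.antlr.v4.runtime.CommonTokenStream

// text holds the input string, e.g. "01.01.-05.01./09.01."
val lexer = DatesLexer(CharStreams.fromString(text))
val parser = DatesParser(CommonTokenStream(lexer))

val parsed = parser.r()
for (element in parsed.element()) {
    // process each element here
}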

Inside the loop you can now access the dates e.g. with

element.daterange()

and

element.singledate()

because, as defined in the *.g4 file above, an element contains either a “daterange” or a “singledate”. The accessor for the alternative that did not match returns null. As you can see, the generated functions use the names that were specified in the *.g4 file.
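Note that the grammar accepts any digits for “day” and “month”, so an input like “99.99.” parses without complaint. A separate check after parsing can catch such values. The following is a minimal sketch in plain Java using java.time.MonthDay; the class DateCheckSketch and the method toMonthDay are hypothetical names, and it assumes you pass in the text of the “day” and “month” nodes as strings.

```java
import java.time.DateTimeException;
import java.time.MonthDay;
import java.util.Optional;

// Hypothetical helper, not part of the generated ANTLR classes.
public class DateCheckSketch {
    // Returns the day/month pair, or an empty Optional if the numbers
    // do not form a valid calendar date (e.g. day 31 in month 4).
    public static Optional<MonthDay> toMonthDay(String dayText, String monthText) {
        try {
            int day = Integer.parseInt(dayText);
            int month = Integer.parseInt(monthText);
            // MonthDay.of takes the month first and throws on invalid combinations
            return Optional.of(MonthDay.of(month, day));
        } catch (NumberFormatException | DateTimeException e) {
            return Optional.empty();
        }
    }
}
```

Because MonthDay carries no year, 29 February is accepted here; whether that is right for your data depends on the year the dates belong to.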

Automatic uploads of Android apk files to Google Play

To save time, you can automate uploading new APK files to Google Play with tools like gradle-play-publisher or fastlane.

The process is simple. First you have to create a Google service account and download the generated p12 file. In the Google Play Console you have to grant this account access. You also have to add a few sections to your project’s Gradle files. You can find instructions on the gradle-play-publisher website.

Then you can download all your existing descriptions with this command:

./gradlew bootstrapReleasePlayResources

To upload a new APK file for a release or an alpha/beta version, use this command:

./gradlew clean publishApkRelease

or this one if you are using App Bundles:

./gradlew clean publishBundle

This also uploads the contents of the “whatsnew” files to Google Play as the changelog.

Parsing PDF files with Java or Kotlin

Often information is not available in a machine-readable format like JSON, XML or CSV. When only a human-readable PDF file is available, you can try a PDF parser to retrieve the needed information. Apache Tika, for example, can read PDF files and return the contents as tokens. That can be quite useful, but it doesn’t return tabular information. If a table has empty cells, you cannot see which cells are empty, and it can be difficult to tell which cell the returned data belongs to.

For this purpose another library called Tabula exists. It provides an easy-to-use local web page that lets you select the tables of a PDF file and export them as CSV or JSON files:

Tabula screenshot running on localhost

You can also embed tabula-java into your own program, e.g. to use it in batch jobs. This Kotlin snippet loads a PDF file pdfFile and writes its tables as JSON into tmpfile:

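import java.io.File
import org.apache.commons.cli.DefaultParser
import technology.tabula.CommandLineApp

// Temporary file that receives the JSON output (the suffix needs the leading dot)
val tmpfile = File.createTempFile("pdfparser", ".json")
// -g: guess the table areas, -l: lattice mode, -f: output format, -o: output file
val args = arrayOf(pdfFile.absolutePath, "-g", "-l", "-f", "JSON", "-o", tmpfile.absolutePath)

// Parse the arguments with the same options as the Tabula command line tool
val parser = DefaultParser()
val cmd = parser.parse(CommandLineApp.buildOptions(), args)

// CommandLineApp needs an Appendable; with -o the tables are written to tmpfile
val stringBuilder = StringBuilder()
CommandLineApp(stringBuilder, cmd).extractTables(cmd)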

From there you can parse the JSON to process it further.