PubMed2XL 2.0 now available for download
PubMed2XL 2.0 is now available. You can read the documentation and download the newest version here. There are a few notable changes to the graphical user interface (GUI) and lots of huge changes...
PyEDS 0.2: a casual update
Update, September 25, 2013: In the interest of "transparency" (ironically, a word ever more hollow), I re-versioned my script and started it at 0.0 instead of 1.0, hence the code below is for version...
too many brackets are being typed in the dark: XPath-like expressions for...
When I first started learning how to program and parse data, the data formats I first got acquainted with were text delimited files and XML. I avoided JSON as long as I could because I found (find?)...
getting started with the Summon API and Python
Why is it I always think clearest when I'm home sick? That's a post for another time (workplace, structure without structure, working without having showered, etc.) … But for now, I'm working on...
MuSEarch: searching within embedded MuseScore.com scores
OK, gonna try and keep this short. I've been obsessing the last few days over the Christmas holiday on this micro-project/demo … i.e. I've been working from morning until midday and taking a shower and...
I KEA, you Maui: some term extraction distraction
I was at the ALA annual conference last week. I came back from San Francisco (SF) on Tuesday morning. I left SF at 9pm their time and came back Tuesday at 9am Charleston time. In my age, I've attained...
redacting naughty words in images with Tesseract, ImageMagick, and dish soap
A major part of the current work I'm doing is to use some natural language processing tools and good old regular expressions to try and identify instances of PII (personally identifiable information)...
hightailing it out of None with lxml
I've been having to do a lot with XML and Python's lxml library for my work. Some of the XML files we are processing are in the 5-10 gigabyte territory. And, well, that kinda sucks in the first place...
a faulty PREMIS
A few months ago I was working on some METS and PREMIS 3 stuff for my current gig. While creating sample METS files containing PREMIS, I saw a discrepancy between the PREMIS 3 Data Dictionary and the...
can version control help with preservation metadata?
I wrote in my last post that I'm doing some stuff with PREMIS for my work and that I had some issues with PREMIS' structure. What I didn't mention was my concern about the overhead in creating PREMIS...