ABSTRACT: Building a Source Code Mining Tool Using Java and Solr
Mining source code repositories for useful insights is a practical way for developers to experiment with readily available data. One problem this approach can address is determining which developers have worked the most with particular clients or tools. In an organization that has been around for a while, information is often spread across different locations (e.g. CVS, Subversion, Git, old emails, SharePoint), each requiring different query tactics. This talk demonstrates how to build a Java application that indexes Git repositories in a Solr full-text index, providing useful analytics through faceted search.
We found that this tool could identify which engineers worked on different projects fairly accurately. While this is already well-known information within the organization, the exercise provides a useful demonstration of configuring full-text search for developers interested in the subject.
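As a rough illustration of the faceted search mentioned in the abstract, the sketch below builds a Solr select URL that asks for facet counts by author for commits matching a search term. It is not taken from the talk: the core name `commits` and the field names `message` and `author` are assumptions made for the example, and a real index would use whatever schema the application defines. Only standard Solr query parameters (`q`, `rows`, `facet`, `facet.field`, `facet.mincount`, `wt`) and the Java standard library (Java 10 or later) are used.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class FacetQuerySketch {

    // Builds a Solr /select URL that returns facet counts for one field
    // and no documents (rows=0). The core URL and field names passed in
    // are hypothetical; they depend on how the index was set up.
    static String buildFacetUrl(String coreUrl, String query, String facetField) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);               // full-text query, e.g. a tool or client name
        params.put("rows", "0");              // only the facet counts are needed
        params.put("facet", "true");          // turn faceting on
        params.put("facet.field", facetField); // field to count values of
        params.put("facet.mincount", "1");    // skip values with zero matches
        params.put("wt", "json");             // ask for a JSON response

        StringJoiner url = new StringJoiner("&", coreUrl + "/select?", "");
        for (Map.Entry<String, String> e : params.entrySet()) {
            url.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return url.toString();
    }

    // Percent-encodes a query-string component as UTF-8.
    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed local Solr core named "commits" with "message" and "author" fields.
        System.out.println(buildFacetUrl(
                "http://localhost:8983/solr/commits", "message:oracle", "author"));
    }
}
```

Requesting this URL from a running Solr instance would return, for each author, the number of indexed commits whose message matches the query, which is the kind of count that suggests who has worked most with a given tool.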
SPEAKER BIO: Gary Sieling
Gary Sieling is a Sr. Software Engineer at Wingspan Technology. He focuses on delivering large enterprise applications into complex client environments. In past projects, he’s worked on data-warehouse-backed products, and he is proficient in a wide range of tools, including Java, ExtJS, Oracle, and Postgres. Gary holds a BS in computer science from the Rochester Institute of Technology. He writes regularly at www.garysieling.com and is also a regular contributor on architects.dzone.com.
MEETING SLIDES: Slides are available on Gary’s site here