The Art and Science of Analyzing Software Data describes the analysis techniques most often used to derive insight from software data. The book shares best practices developed by leading data scientists, drawn from their experience training software engineering students and practitioners to master data science.
Topics include the analysis of security data, code reviews, app stores, log files, and user telemetry, among others. The techniques presented range from co-change analysis, text analysis, topic analysis, and concept analysis to advanced topics such as release planning and the generation of source code comments. The book also includes stories from the trenches, in which expert data scientists illustrate how to apply data analysis in industry and open source, present results to stakeholders, and drive decisions.
- Past, Present, and Future of Analyzing Software Data

PART 1 TUTORIAL-TECHNIQUES

- Mining Patterns and Violations Using Concept Analysis
- Analyzing Text in Software Projects
- Synthesizing Knowledge from Software Development Artifacts
- A Practical Guide to Analyzing IDE Usage Data
- Latent Dirichlet Allocation: Extracting Topics from Software Engineering Data
- Tools and Techniques for Analyzing Product and Process Data

PART 2 DATA/PROBLEM FOCUSSED

- Analyzing Security Data
- A Mixed Methods Approach to Mining Code Review Data: Examples and a Study of Multicommit Reviews and Pull Requests
- Mining Android Apps for Anomalies
- Change Coupling Between Software Artifacts: Learning from Past Changes

PART 3 STORIES FROM THE TRENCHES

- Applying Software Data Analysis in Industry Contexts: When Research Meets Reality
- Using Data to Make Decisions in Software Engineering: Providing a Method to our Madness
- Community Data for OSS Adoption Risk Management
- Assessing the State of Software in a Large Enterprise: A 12-Year Retrospective
- Lessons Learned from Software Analytics in Practice

PART 4 ADVANCED TOPICS

- Code Comment Analysis for Improving Software Quality
- Mining Software Logs for Goal-Driven Root Cause Analysis
- Analytical Product Release Planning

PART 5 DATA ANALYSIS AT SCALE (BIG DATA)

- Boa: An Enabling Language and Infrastructure for Ultra-Large-Scale MSR Studies
- Scalable Parallelization of Specification Mining Using Distributed Computing