Blogs by Sean
(1/17/25) Year in Text 2024, a Sean's Technical Stuff Feature
Tags: Year In Text, Sean's Picks, 2024, newyear, synopses, reviews, project, summaries, projects
With Year In Text, I summarize my projects and report my top books and other media from the year. You will find synopses, reviews, insights, and external purchase links for texts spanning a variety of genres and subjects. You will also find a sneak peek of what is coming up in 2025. Please enjoy!
(7/30/24) The case for containers in computation
Tags: docker, dockerhub, engineering, software, cloud, dependencies, distribution, reproducibility
In this blog, we explore the following key questions:
- 1. What are containers and why are they important for users, developers, and researchers?
- 2. In programming and computational research, what are the benefits of container use and risks of not using them?
- 3. How does familiarity with containers benefit your career, and what projects and fields already make extensive use of this important technology?
(2/17/23) Better benchmark workflows, and why you should use R with NextFlow
Tags: workflows, nextflow, parallelization, benchmarks, R, algorithms, deconvolution, high-throughput, data science
As workflow technologies continue to be updated and improved, their learning materials become more robust, and their support communities grow, there will be fewer barriers to using them to streamline day-to-day development routines, especially when dealing with complex parallel tasks. Yet relatively few learning resources cover the management of standard benchmarking tasks using workflows. Even fewer provide solutions for specific domains like bioinformatics and the R programming language. In this post, I attempted to address this gap with several solutions, arrived at after considerable brainstorming, research, trial and error, and conversations with my stellar computational bioscience colleagues. I ultimately found that not only can R be used with NextFlow for benchmarking, but there are many domains where this probably should be the standard approach.
(10/13/21) Run R package checks with a shell script
Tags: developers, command-line, bash, scripting, R, checks, tests, automation
While checks are crucial to R package development, running them from the command line can quickly become repetitive. I’ve written a shell script, rpackagecheck.sh, that runs the standard steps for checking an R package. The script uses R CMD ... to install, build, and check packages with any combination of the three major check types. This script can help discourage accidents, such as running check on a directory rather than a .tar.gz file, and ultimately expedite your development workflow.
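To make the idea concrete, here is a minimal sketch of the build, install, and check sequence with the directory-versus-archive guard. It is written in Python rather than shell and is not the rpackagecheck.sh script itself. The function name build_commands and the package name mypkg are hypothetical, and the sketch only assembles the R CMD command lines rather than running them.

```python
from pathlib import Path


def build_commands(pkg_dir, tarball):
    """Return R CMD invocations for build, install, and check, in that order.

    Hypothetical helper for illustration; it only assembles command lists.
    """
    pkg_dir, tarball = Path(pkg_dir), Path(tarball)
    # Guard against the accident of checking a source directory
    # instead of the built .tar.gz archive.
    if not tarball.name.endswith(".tar.gz"):
        raise ValueError(f"expected a .tar.gz file, got: {tarball}")
    return [
        ["R", "CMD", "build", str(pkg_dir)],
        ["R", "CMD", "INSTALL", str(tarball)],
        ["R", "CMD", "check", str(tarball)],
    ]


if __name__ == "__main__":
    # "mypkg" and its version number are made-up example values.
    for cmd in build_commands("mypkg", "mypkg_0.1.0.tar.gz"):
        print(" ".join(cmd))
```

Each returned list could be passed to subprocess.run(cmd, check=True) on a machine where R is installed, so that a failing step stops the sequence.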
(2/13/20) Cracking the Monty Hall problem with brute force simulation
Tags: simulations, algorithms, R, statistics, data science
You stand on a game show stage before 3 closed doors, behind which wait 2 goats and 1 prize. You are called on to pick a door to be opened to reveal either a goat or a prize. The host, Monty Hall, then reveals a goat behind one of the two remaining unpicked doors. You are then given the option to switch your door selection to the final unpicked door before the big reveal. What should you do?
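The game described above can be sketched as a brute force simulation. This is a minimal Python sketch and not the code from the post, which is tagged as using R. The function names, the number of rounds, and the seed are all assumptions chosen for illustration.

```python
import random


def play_round(switch, rng):
    """Play one game; return True if the contestant ends on the prize door."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one door that is neither the original pick nor the opened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize


def win_rate(switch, n_rounds=100_000, seed=0):
    """Estimate the win probability of a strategy over many simulated games."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(n_rounds))
    return wins / n_rounds


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Comparing the two printed rates answers the question empirically, and raising n_rounds tightens both estimates.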
(11/12/18) My 2018 proposal for the Better Scientific Software Fellowship
Tags: computer-science, software, research, fellowships, funding, grants
I wanted to share my proposal for the 2018 Better Scientific Software (BSSw) Fellowship. BSSw aims to increase and preserve integrity and standards for publishing computer code in science, and their fellowship program recognizes and supports advocates of this cause. You may or may not be aware that we currently lack standard ways of referencing published code in science as independently citable units. Furthermore, vital source code for experiments can be distributed in many places, including supplemental materials sections behind paywalls, personal websites that may become inaccessible or go offline over time, and repositories on GitHub or elsewhere that may not include inherent and persistent identifiers. I propose using an autocompilation technology to aggregate published scientific computer code and code metadata into a new database, called Pubsrc. This will enable novel assessments of scientific code use, including automatic generation of dependency usage networks, tracking the impact of newly discovered software bugs throughout research, and making scientific code independently citable. I hope you enjoy reading my proposal, and please share or tweet this post if you support this cause.
(11/24/15) Decoding the Biotech Job Listing Part 1: "Good Communication", Collaboration, and Coevolution
Tags: networking, applications, job-hunt, biotech, research, industry
The requirements of "good communication skills" and "comfortable working in a multidisciplinary setting" on a biotech job listing are common, but what do these mean in practice? For work in biotech, there's often more than meets the eye in a job posting.
(06/30/15) Finding the right job fit in Biotech
Tags: networking, biotech, research, industry
When it comes to finding the right job fit, in biotech or elsewhere, an initial step is to simply identify what factors are most important.
But one's idea of the right job fit can change over time as the economy shifts, new experiences are acquired, and professional networks (analogous to Titz et al 2008's protein-protein interaction network at right) expand. When the ideal job is not available, the best fit can be a position that leads in the right direction. Practically speaking, this means the factors that contribute to a good job fit can change over time and need to be repeatedly reassessed.
(05/22/15) In Seattle, Biotechnology Is On the Upswing
Tags: biotech, research, industry, PNW, Seattle
The biotech industry is booming. Some reservations notwithstanding, influxes of venture capital reflect economic robustness in the biotech market, and the recent pledge of funds for personalized (or "precision") medical research by the Obama administration reflects widespread public interest in the industry.
(05/15/15) Using the Informational Interview to Land a Job You'll Love
Tags: network, career, interviews, biotech, research, industry, nonprofit, academia
You're finishing your program and want to find a rewarding job, but nothing seems to work. How can you improve your chances?
If only there were something to be done between filling out forms and waiting for replies, something that could help you in your pursuit. The job application process can be arduous. It doesn't help that online applications can make the process seem faceless. Maybe your strengths aren't reflected well on paper. Maybe you find your strength lies in conversation, but the job interview seems too confined and canned for you to really shine. If "networking" and "job interview" seem like mutually exclusive concepts, it is time to reconsider.