Building a best-in-class digital workforce was one of our goals in 2019, and we made several strides in that direction. Developer productivity is one measure of this goal, and in that spirit, of empowering developers with best-in-class tools and productivity enhancers, I got a chance to evaluate Sourcegraph. In this article I want to share how the evaluation went and what I learned, primarily from a developer’s point of view.
Sourcegraph is a code search and web-based code intelligence tool for developers. It offers all of its features at scale across a large “space”:
- Public service: [24 programming languages] x [all open source repos] x [all repo hosts = GitLab, GitHub, Bitbucket, AWS CodeCommit] x [all branches]
- Private service: [24 programming languages] x [all private repos on a self-hosted server] x [GitLab, GitHub, Bitbucket] x [all branches]
I approached the evaluation along the three pillars of Sourcegraph: Search, Review and Automation. Describing the rich feature set of Sourcegraph is an exercise in itself; instead, I will try to make a case for why this tool stands out and how it improves a developer’s productivity.
Search: While writing code, a developer often needs to look at the definition of a method. Most of the time the definition is available in the IDE (on the laptop), and finding it is a matter of remembering the method name and using the IDE’s search. It gets trickier if one does not remember the exact phrase, does not know what to look for, or does not have that code in the local dev environment. One then has to clone multiple repos in order to navigate to the definition, find references and complete the review. This is where I see Sourcegraph adding value. Using advanced features like regex search (which can be narrowed to specific languages such as Python, Go or Java), symbol search (which searches only variable and function names) and comby search (a structural search, more powerful than regex, that understands balanced parentheses), the developer can do or find out the following, among other things, right in the browser:
- How an API should be invoked
- What is the impact of modifying an existing API
- Find variables starting with a specific prefix
- Find a function call and replace the argument (via comby search)
- Refactoring a monolith into microservices
- Learning new ways to write code from both internal private repos and open source repos
- Learning enterprise-standard ways to read tokens securely and pack PoP tokens in microservices and client-side code
- Search through thousands of private repos with GBs of data, where it is not always possible or efficient to clone them locally
- Where are the environment configs declared
- Find specific toggles across the source
- How to implement a given algorithm
- Use GraphQL APIs (code-as-data) to power internal telemetry around source code metadata
- What recently changed in the code for a feature, page, journey, etc. that broke it? One can search commit diffs and commit messages
- Use predefined searches curated for you or your team/organization. These can also be used to send you alerts/notifications when developers add or change calls to an API you own
- Search for instances of secret_key in the source, for example in Terraform files:
repo:^github\.com/gruntwork-io/ file:.*\.tf$ \s*secret_key\s*=\s*".+"
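To get a feel for what the regex portion of the query above matches, here is a minimal Python sketch that applies the same pattern to a few made-up Terraform lines. The sample lines and the helper name `find_hardcoded_secrets` are assumptions for illustration only. The `repo:` and `file:` filters are Sourcegraph search syntax and are not modeled here, and Python’s `re` module stands in for whatever regex engine Sourcegraph uses, so edge cases may differ.

```python
import re

# Regex portion of the Sourcegraph query above. The repo: and file:
# filters are Sourcegraph-specific and are not reproduced in this sketch.
SECRET_PATTERN = re.compile(r'\s*secret_key\s*=\s*".+"')


def find_hardcoded_secrets(lines):
    """Return the lines that assign a non-empty string literal to secret_key.

    Hypothetical helper, named for this example only.
    """
    return [line for line in lines if SECRET_PATTERN.search(line)]


# Made-up Terraform snippets, for illustration only.
sample_tf = [
    'access_key = "AKIAEXAMPLE"',
    '  secret_key = "not-a-real-secret"',
    'secret_key = var.secret_key',
    'secret_key = ""',
]

if __name__ == "__main__":
    for hit in find_hardcoded_secrets(sample_tf):
        print(hit)
```

Only the line with a quoted, non-empty literal is reported. The variable reference and the empty string are skipped because the pattern requires at least one character between the double quotes.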
Sourcegraph enables all of the above by offering search that spans multiple code hosts (GitHub, GitLab, Bitbucket and many others), multiple repos on each host, all branches of a repo, and both open source repos and privately held enterprise repos. Note that open source and private enterprise repos currently cannot be searched with a single query; it takes two separate searches. Also, searching code is not the same as searching text: if you search for “http request” on GitHub, the results come with a bunch of noise that includes “httprequest” and “http_request” as well. All of this contributes to shipping code faster.
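Since public and private code need separate searches, one way to script this is to send the same query to both servers through the GraphQL API. The sketch below only builds the two HTTP requests and does not send them. It assumes the `/.api/graphql` endpoint and the `Authorization: token ...` header; the private host name `sourcegraph.example.internal` and the helper names are hypothetical, and the GraphQL field names (`search`, `results`, `matchCount`) should be checked against the API console of your own instance.

```python
import json
import urllib.request

PUBLIC_URL = "https://sourcegraph.com"
# Hypothetical address of a self-hosted instance; replace with your own.
PRIVATE_URL = "https://sourcegraph.example.internal"

# Assumed shape of the search query; verify it in your instance's API console.
SEARCH_QUERY = """
query ($query: String!) {
  search(query: $query) {
    results {
      matchCount
    }
  }
}
"""


def build_search_request(base_url, search, token=None):
    """Build (but do not send) a GraphQL search request for one server."""
    payload = json.dumps(
        {"query": SEARCH_QUERY, "variables": {"query": search}}
    ).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Access token created in the user settings of the instance.
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(
        f"{base_url}/.api/graphql", data=payload, headers=headers, method="POST"
    )


def build_requests_for_both(search, private_token):
    """Two separate requests: one for public code, one for private code."""
    return [
        build_search_request(PUBLIC_URL, search),
        build_search_request(PRIVATE_URL, search, private_token),
    ]


if __name__ == "__main__":
    for req in build_requests_for_both("secret_key", "dummy-token"):
        # To actually run a search, pass req to urllib.request.urlopen(req).
        print(req.get_method(), req.full_url)
```

Keeping the two requests separate mirrors the limitation described above: the results have to be merged by the caller, not by Sourcegraph.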
Review: With the advances in source code hosting tools like GitHub and GitLab, code reviews have become far less arcane: the reviewer receives a notification, opens the review in a browser that renders the diff, comments, approves or rejects the request and moves forward with the workflow, all inside a browser. Everyone recognizes the importance of this step, which offers the opportunity to catch non-mechanistic patterns and avoid costly mistakes (logical/semantic errors, wrong assumptions, etc.), and which is also a medium of knowledge transfer. Sourcegraph goes a couple of steps further and empowers the reviewer with source code navigation right in the browser, where code views can be extended and decorated using Sourcegraph extensions:
- Navigate source code as if in an IDE. Hover over a method name to “Go to Definition”. This is a huge time saver and makes the process efficient (through their indexing algorithm)
- In a similar vein, in addition to seeing the definition of a method, the reviewer might want to check who else is using or referencing it in order to assess the impact of the change. SG offers “Find references” across all repos that are indexed on the server.
- Note that SG is self-hosted (tm/sourcegrapheval) in order to index all enterprise private repos, so the code never leaves the network. For searching non-enterprise open source code, however, there is a publicly available cloud service at sourcegraph.com. I would have loved an option where the private server falls back on the public server seamlessly, but maybe the SG team will make that available in a future release.
- Corroborate the change with test coverage numbers and runtime traces. While the two points above (“Go to Definition” and “Find references”) constitute 80% of the use cases, the reviewer can feel more confident about the change when the review also shows test coverage numbers. SG offers that and goes one step further by making trace performance numbers from runtime available (via supported services that have to be enabled)
Automation: I have discussed two pillars so far: Search and Review. Now think about performing those acts at enterprise scale, with appropriate roles and visibility via workflows. SG offers a beta feature called Automation which does exactly that: remove legacy code, fix critical security issues and pay down tech debt. It provides the ability to create campaigns across thousands of repositories and code owners. Sourcegraph automatically creates and updates all of the branches and pull requests, and you can track progress and activity in one place. This is huge. Imagine that scale! This capability enables the following use cases:
- Remove deprecated code: Monitor for legacy libraries, and coordinate upgrading all the affected repos iteratively
- Triage critical issues
- Dependency updates
- Enable distributed adoption
- Reduce the cost and complexity of sunsetting legacy applications
Essentially, it offers the sum total of all code intelligence (enterprise-wide) in a split second via its search, upping the ante for developer experience. Here is a quick feature set comparison with other similar solutions: https://about.sourcegraph.com/workflow/#summary-feature-comparison-chart
Proof of Concept
Installation was super simple: the distribution is available as a Docker container, which I deployed on one of our EC2 instances. I was also able to integrate it with Okta (our AuthN service) pretty seamlessly, and colleagues from across the enterprise were able to play around with it and try out some use cases. Once adoption grows, I plan to deploy Sourcegraph on a Kubernetes cluster.
While integrating with Okta, I noticed a small issue with the SAML handshake. When I mentioned it to the Sourcegraph team, they hopped on a call, helped debug it, made a quick change in the product and provided a release candidate with the hotfix, which I was able to upgrade to in a matter of minutes without losing any of the configuration done thus far. Loved the experience!
“A developer platform is the one place where Developers and DevOps teams go to answer questions about code and systems. It ties together information from many tools, from repositories on your code host to dependency relationships among your projects and application runtime information.” - Sourcegraph
I am encouraged to see how it fits into T-Mobile’s strategy of creating an environment that facilitates rapid experimentation and enables faster change, with the goal of empowering developers and DevOps to build better software faster, safer and healthier.
In that spirit, T-Mobile has made crucial moves in the last few years towards optimizing the Continuous Delivery Platform: from custom CI/CD processes on prem, to industry-standard processes on hosted solutions on prem, to cloud-based GitLab, a one-stop shop for source code management and the DevOps lifecycle along with DevSecOps. In Nov 2019, Sourcegraph and GitLab announced (relevant MR) a native integration offering a big improvement to the developer UX. Although the Sourcegraph browser extension will continue to work, the integration with GitLab means that developers unwilling to install browser extensions are able to enjoy valuable features seamlessly integrated with their GitLab workflow.
Note: The enterprise private repositories still have to be hosted on a private Sourcegraph server; it is just that the experience is now delivered natively via GitLab workflows and not via browser plugins.
Given my positive experience and the immense potential, I plan to recommend Sourcegraph to our procurement team, in the hope of making it available to all Dev, DevOps and DevSecOps teams at T-Mobile.