Hello,

On Tue, 7 Jul 2020 at 00:30, Kunal Mehta <[email protected]> wrote:
> > hosting a *private node package repository* in some form,
> > typically in a git repository, where only *vetted versions* of packages
> > are checked in.
>
> In theory this addresses the problems, but I think the biggest problem
> is just the volume and quality of code that needs reviewing.

I suspected that vetting and the sheer quantity of code would pose big challenges. To my surprise, I haven't found any information on how that's done. To discuss this topic, I've created a subtask of the build step RFC: T257072 <https://phabricator.wikimedia.org/T257072> "Determine Node package auditing workflows".

> Here's what the actual diff looks like:
> <https://libup-diff.wmflabs.org/change/605082>.

Analyzing all that information is a superhuman task. A Gerrit/GitLab-like review interface would make it more approachable. The way pnpm stores packages (uncompressed in a git repo) would enable that. Yarn stores .zip files, but a separate uncompressed repo could be used for code review, with updates submitted as regularly reviewed patches.

> I don't believe it's possible to review that much code on a regular
> basis, reacting to the speed at which many npm packages move. We could
> stop upgrading all the time, but that would effectively be forking and
> IMO put us in a worse position.

100% review wouldn't be sustainable IMO (it would cause developer burnout very quickly), but looking for specific patterns exhibited by malicious packages could be a successful approach to increasing trust in the audited code. Patterns like:

* An unmaintained repo receiving an update, which is a common way to inject malicious code.
* New packages added to the dependency tree.

An interesting article in this regard: https://portswigger.net/daily-swig/new-npm-scanning-tool-sniffs-out-malicious-code
I wonder if the npm-scan <https://github.com/spaceraccoon/npm-scan> tool mentioned therein has been evaluated. The repo of past malicious packages (npm-zoo <https://github.com/spaceraccoon/npm-zoo>) is also worth mentioning.
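
As a rough illustration of the two patterns listed above, here is a minimal Python sketch. Everything in it is an assumption made for the sake of discussion: the function names, the one-year dormancy threshold, and the input shape are hypothetical and not taken from any existing tool. It expects plain name-to-version maps (which would first have to be extracted from a lockfile) and publish dates (which would have to be looked up separately).

```python
from __future__ import annotations

from datetime import date, timedelta


def new_packages(old_tree: dict[str, str], new_tree: dict[str, str]) -> list[str]:
    """Return the names that appear in new_tree but not in old_tree."""
    return sorted(set(new_tree) - set(old_tree))


def dormant_updates(
    old_tree: dict[str, str],
    new_tree: dict[str, str],
    release_dates: dict[tuple[str, str], date],
    dormant_after: timedelta = timedelta(days=365),  # assumed threshold
) -> list[str]:
    """Return packages whose version changed after a long gap between releases.

    release_dates maps (name, version) to the publish date. The gap is
    measured between the two versions we had installed, which only
    approximates how long the package was really dormant upstream.
    """
    flagged = []
    for name, new_version in new_tree.items():
        old_version = old_tree.get(name)
        if old_version is None or old_version == new_version:
            continue  # brand-new or unchanged packages are handled elsewhere
        prev_date = release_dates.get((name, old_version))
        new_date = release_dates.get((name, new_version))
        if prev_date is None or new_date is None:
            continue  # unknown dates: this sketch skips them rather than guess
        if new_date - prev_date > dormant_after:
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    # Made-up package names, versions and dates, purely for illustration.
    old = {"pkg-a": "1.0.0", "pkg-b": "2.1.0"}
    new = {"pkg-a": "1.0.1", "pkg-b": "2.1.0", "pkg-c": "0.1.0"}
    dates = {
        ("pkg-a", "1.0.0"): date(2017, 3, 1),
        ("pkg-a", "1.0.1"): date(2020, 6, 30),
    }
    print(new_packages(old, new))            # ['pkg-c']
    print(dormant_updates(old, new, dates))  # ['pkg-a']
```

Neither check proves anything on its own; the idea is only to narrow down which updates deserve a human look.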
I've collected a few notable incidents in the RFC under the section "A few examples of NPM incidents" <https://www.mediawiki.org/wiki/User:Aron_Manning/RfC:_Evaluate_alternative_Node_package_managers_for_improved_package_security#A_few_examples_of_NPM_incidents>.

> I also note that it's impossible to review just the git changelog of a
> package, because the npm maintainer can upload any arbitrary tarball of
> code to npm, whether or not it matches the git repo. (This isn't
> exclusive to npm, pypi, crates.io suffer from this problem too.
> composer/packagist doesn't though.)

A tool looking for differences between the git repo and the npm tarball could be useful. It's possible, though, that many packages would require special treatment if the tarball isn't simple to map onto the git repo. However, even a simple check of whether a new npm release has a corresponding git tag or release (or any commits at all) would catch injections done with a stolen NPM token.

> How do these alternative package managers address the quantity of npm
> packages installed that need review?

None of the package managers can, or intends to, do code review, apart from `npm audit`, which is available with all 3. It seems to me there is an expectation that PMs will protect us. It should be clarified that no tool can do that; the purpose of these tools is to give 100% control over which packages and versions are installed. What these PMs provide is detailed in the RfC for evaluating alternative PMs <https://www.mediawiki.org/wiki/User:Aron_Manning/RfC:_Evaluate_alternative_Node_package_managers_for_improved_package_security#Package_managers>.

Which versions we add to the local package repo is up to us. Separate ticket: T257072 <https://phabricator.wikimedia.org/T257072>

I think a 2-stage deployment process would subject packages to as much scrutiny as possible within the constraints:

1. An auditing package repository with all the updates to be vetted.
   This would be used in sandboxed environments to expose updates to developers, who could notice unusual behavior or other warning signs.
2. A stable package repository for CI and non-sandboxed environments.

The review process:

1. The auditing repo only includes packages used in WMF projects.
2. Package versions need to be greenlighted for auditing too. This is preceded by a basic check of the validity of that version, to look for, e.g., injections made with stolen credentials; the code itself is only reviewed if something is suspicious, e.g. new packages or unexpected updates.
3. Package versions would stay in this stage for some time (e.g. 2 weeks), depending on the package and urgency.
4. A changelog in the auditing repo tracks the newest updates, informing developers about which packages to pay attention to.
5. One of the developers dedicated to package vetting does a deeper review of the code. This should be aided by heuristic tools.
6. Once confidence in a version is built, that version is greenlighted for the stable repo.

Demian (Aron)
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
