All of these organizations have developed APIs for creating, processing, transforming, storing, tracking and accessing machine learning models, their hyper parameters and big datasets. These APIs bring their teams together. These APIs act like verses of some scripture which they use day in day out. Stability, performance, speed, accuracy, interpretability, maintenance and team's growth all of these fall in place automatically. This makes the life of a data scientist easy. It brings joy in working and increases chances of internal innovation. This allows appreciation for employees because tools/libraries built by them eventually makes practical difference.
Fortunately, I had created a somewhat similar setup in my last startup - collection of speech files, naming convention, raw file storage on disk and in memory, versioning of speech models and hyper parameter, REST API, developer API doc, bug tracker, developer friendly environment and knowledge sharing sessions fully utilizing developer wiki etc. But with all humbleness, I say that there was scope for improvement. Today if I have to design that, I will design it in much better way - fulfilling almost all aspects of stability, performance, speed, accuracy, interpretability, maintenance and team's growth. Better technologies and well documented open source libraries are available today compared to about 4 to 5 years back. Microservices based solutioning has become easy due to availability of stable process virtualization and orchestration tools and platform virtualization technologies - which in turn solving scalability issues, separation of concerns and removing hardware/OS dependencies. Today, this is a proven ecosystem under which long term great results can be achieved in data science.