Hands-On AI Part 3: The Anatomy of an AI Team
Published: 07/14/2017
Last Updated: 07/27/2017
A Tutorial Series for Software Developers, Data Scientists, and Data Center Managers
This article focuses on projects that require a team and on how to find appropriately skilled contributors for it.
Search for Experts
Begin by defining the skills and assistance that you need for your project. During this process, you may determine that one of the following scenarios applies to you:
- You are capable of building an AI application yourself and can clearly define the project requirements, but you need the specialized skills of additional individual contributors to speed up the project timeline or to achieve a better quality product.
- You are a product manager or visionary but you need more extensive overall technical help to fully define the project requirements, including determining the project specifications, tools, and skills.
If the app you want to build is mainstream (an application of AI in a specific domain that does not require extensive research or narrow expertise), the composition of your team is fairly predictable. The following discussion outlines the profiles of the team members that you may need.
Roles
Who you add to your team depends significantly on the type of project you want to build. You may only need one individual contributor, or several more people.
Data Scientist
A data scientist commonly has the following required skills:
- Algorithm design and analysis
- Linear algebra
- Probability theory
- Mathematical statistics
- A scripting language (primarily Python*)
- A data modeling language (primarily Python, R*, MATLAB*, Mathematica*)
- A data management language (SQL and its derivatives such as Pig* and Hive*)
- Machine learning (classification, regression, and clustering)
- Relational database (PostgreSQL*, MySQL*, Oracle*)
- Version control (for example, Git*)
Additional desired skills may include advanced machine learning (topic modeling, search, graph mining, matrix factorization, time series analysis, structured learning, and so on).
Theoretically, a data scientist can work on any project. However, it makes sense to ask for experience in a specific industry, such as credit scoring for banking, digital advertising, or search engines. Examples of projects a data scientist may be involved with include:
- Build an offline scoring model (credit, churn, propensity to buy a product, and so on)
- Perform an offline clustering of a data set (clients, products, transactions, and so on)
- Build a recommendation engine
- Perform part-of-speech tagging and entity detection
- Perform sentiment analysis
- Forecast time series
- Predict click-through rate (CTR)
Deep Learning Data Scientist
A data scientist who specializes in deep learning typically needs the following skills:
- Deep learning theory, such as word2vec, convolutional neural networks (CNNs), recurrent neural networks, long short-term memory networks (LSTM), and generative adversarial networks
- Deep learning frameworks, for example TensorFlow*, Caffe*, Theano*, MXNet*, Torch*, or Keras*
Like the data scientist role, deep learning experts specialize in different domains. For example:
- Computer vision, which requires knowledge of CNNs
- Natural language processing and text mining, which require knowledge of recurrent neural networks, LSTM networks, gated recurrent units (GRU), and word2vec
- Audio processing (speech, music) and machine translation, which require knowledge of recurrent neural networks (LSTMs, GRUs)
- Time series analysis, which requires knowledge of CNNs and recurrent neural networks (LSTMs or GRUs)
Typical projects include:
- Object detection from images
- Image tagging
- Target tracking in a video stream
- Machine translation
- Speech-to-text and speech generation
- Word similarity
- AI for a self-driving car
- Music generation
Data Analyst
A data analyst needs largely the same skills as a data scientist. The difference is one of emphasis: a data analyst can work on descriptive data analysis but may not be as comfortable with predictive analytics, such as machine learning (classification, regression, and clustering) and data mining.
Typical projects include:
- Metrics and reporting
- Building an analytical dashboard
- Traditional modeling tasks (for example, user base segmentation, classification, or regression)
Data Engineer
This role is directly related to the concept of big data. A data engineer can make a machine learning algorithm work at scale for a very large data set. While the data engineer’s skills overlap with those of a data scientist, there are some key differences.
Required skills include:
- Algorithm design and analysis
- A scripting language (primarily Python)
- Familiarity with machine learning theory (classification, regression, or clustering)
- Distributed machine learning framework (Spark*, Storm*, H2O.ai*)
- A data management language (SQL and its derivatives such as Pig and Hive)
- Relational database (PostgreSQL*, MySQL*, Oracle)
- Distributed systems and NoSQL* databases (Hadoop*, Cassandra*, HBase*, Riak*, Kafka*, Dynamo*, Redis*, MongoDB*, or ElasticSearch*)
- Version control (for example, Git)
- Cloud computing (for example, Amazon Web Services*, Microsoft Azure*, or Google Cloud*)
- Linux*
A data engineer is typically also familiar with container technology, such as Docker*. Typical projects involve data sets exceeding 100 GB that must be processed as streams, such as IT sensor readings, video, or audio.
Development Operations (DevOps) Engineer
This person maintains the infrastructure of a project and needs solid software engineering skills, knowledge of operating systems, distributed systems, and cloud computing.
Required skills include:
- Algorithm design and analysis
- A scripting language (primarily Bash)
- Relational database (PostgreSQL*, MySQL, Oracle)
- Distributed systems and NoSQL databases (Hadoop, Cassandra, HBase, Riak, Kafka, Dynamo, Redis, MongoDB, or ElasticSearch)
- Version control (for example, Git)
- Cloud computing (for example, Amazon Web Services, Microsoft Azure, or Google Cloud)
- Linux
- Container technology (for example, Docker)
- Computer networking (for example, TCP/IP protocol or DNS)
- Computer security (SSH protocol or VPN)
- Continuous deployment (for example, Travis CI)
Desired skills include:
- A data management language (SQL and its derivatives such as Pig and Hive)
- API design
Typical projects are orchestrating machines in a cluster or configuring a consistent portable developer environment.
Additional Critical Roles
There are several more traditional roles that are crucial in software engineering projects. The following briefly covers their areas of responsibilities and expected contribution to a project.
Software Engineer (Back End)
Helps build an API, configure the database, and write business logic (for example, payments, authentication, notifications, or messaging). Some modern technologies to look for are Node.js*, Flask*, Django*, Ruby on Rails*, Akka*, Spray*, Go*, PostgreSQL*, MySQL*, MongoDB*, Redis, RabbitMQ*, and Docker*.
Software Engineer (Web Front End, Mobile)
Builds a user interface for applications that support interactivity. Some modern technologies to look for are React.js*, Angular.js*, D3.js*, Vue.js*, Node.js*, HTML, JavaScript*, CSS, Bower*, Gulp*, Less*, Bootstrap*, and jQuery*.
Designer
Focuses on the user experience or graphic design, or both. Typically creates an initial prototype, collaborates with the product manager to test it with users, and prepares the final materials for a front-end engineer. Some of the modern technologies to look for are Adobe Photoshop*, Sketch*, and InVision*.
Product Manager
Sets the product vision, works with the users to define the product specifications, and makes sure that all things fit nicely together. Ideally, this person can also take on the responsibilities of a system architect and define the technical specifications.
Project Manager
Coordinates all team members, makes sure deadlines are met, removes barriers impeding project work, communicates with the product owner or a business client, defines the project roadmap, selects tools for the team to use, and handles other invisible but important team infrastructure work.
System Architect
Converts product specifications and vision into technical specifications and engineering tasks for the team to execute. Makes sure all components fit together and the integration process runs smoothly by accurately defining APIs, dependencies, and so on. Must be a senior person on the team who has managed at least a few software projects from start to finish.
Domain Expert
Has a deep understanding of a specific industry or vertical. Typically, the information coming from a domain expert is useful during feature engineering for machine learning and data annotation.
Quality Assurance Analyst
Makes sure the application meets all declared specifications and works flawlessly. Quality assurance analysts are typically included in projects that involve interactivity or a user interface. Can also check the output of a machine learning model and perform error analysis along with the data scientists and data annotation experts on the team.
Team Composition Framework
Now that you know the roles and skills involved, let’s look at a set of guiding questions you can use to help determine what kind of professionals you need on your team.
- Is it an AI or data project?
- If yes, add a data scientist or a data analyst.
- Does your project include images, speech, video, or a large text collection?
- If yes, add a data scientist who should ideally know deep learning.
- Do you need to prepare data for your project yourself?
- If yes, add a domain expert for data annotation.
- Does your app work with a very large data set and must it sustain a high load?
- If yes, add a data engineer.
- Does your app work offline or online?
- If online, add a DevOps engineer or a software engineer (back end) to your project. A data engineer can potentially do the online app deployment.
- Does your app have a user interface?
- If yes, add a front-end engineer and optionally a designer.
- How many components does your app have?
- If more than one, add a system architect and a quality assurance analyst, and multiply the number of data scientists by the number of components.
- How many people are on your team?
- If you have at least three people, add a project manager.
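The guiding questions above can be read as a simple decision procedure. The following Python sketch is one possible encoding of it, not a definitive implementation. The names `ProjectAnswers` and `suggest_team` and all field names are hypothetical. The sketch also makes a few simplifying assumptions that the questions leave open: it merges the first two questions into a single data role, it picks a back-end software engineer for online apps (a DevOps engineer is the alternative), it leaves out the optional designer, and it uses the suggested headcount as the team size for the project manager question.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ProjectAnswers:
    """Answers to the guiding questions. All field names are hypothetical."""
    is_ai_or_data_project: bool
    has_media_or_large_text: bool   # images, speech, video, or a large text collection
    prepares_own_data: bool
    large_data_and_high_load: bool
    is_online: bool
    has_user_interface: bool
    component_count: int


def suggest_team(answers: ProjectAnswers) -> Dict[str, int]:
    """Return a role -> headcount mapping derived from the guiding questions."""
    team: Dict[str, int] = {}

    # Assumption: one data role covers the first two questions, and it is a
    # deep learning role when the project involves media or a large text collection.
    data_role = None
    if answers.has_media_or_large_text:
        data_role = "deep learning data scientist"
    elif answers.is_ai_or_data_project:
        data_role = "data scientist"  # a data analyst is the other option
    if data_role:
        team[data_role] = 1

    if answers.prepares_own_data:
        team["domain expert"] = 1  # for data annotation
    if answers.large_data_and_high_load:
        team["data engineer"] = 1
    if answers.is_online:
        # Assumption: back-end engineer chosen; a DevOps engineer also fits.
        team["back-end software engineer"] = 1
    if answers.has_user_interface:
        team["front-end engineer"] = 1  # a designer is optional and omitted here

    if answers.component_count > 1:
        team["system architect"] = 1
        team["quality assurance analyst"] = 1
        if data_role:
            # One data scientist per component.
            team[data_role] *= answers.component_count

    # Assumption: the suggested headcount stands in for the actual team size.
    if sum(team.values()) >= 3:
        team["project manager"] = 1
    return team


# Hypothetical answers for an online, two-component app that processes images.
answers = ProjectAnswers(
    is_ai_or_data_project=True,
    has_media_or_large_text=True,
    prepares_own_data=True,
    large_data_and_high_load=False,
    is_online=True,
    has_user_interface=True,
    component_count=2,
)
for role, count in suggest_team(answers).items():
    print(count, role)
```

With these answers, the sketch suggests two deep learning data scientists and adds a project manager. Treat the result as a starting point: headcount per role and which roles can be combined still depend on the size and complexity of the project.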
As described in Part 1, the movie-making app extracts emotions from uploaded images using an image processing (emotion recognition) algorithm, generates music that represents the extracted emotions, and then creates a movie that combines the images and music.
Given that information, let's apply the framework defined above:
- Is it an AI or data project?
- AI.
- Does your project include images, speech, video, or a large text collection?
- Yes.
- Do you need to prepare data for your project yourself?
- Yes.
- Does your app work with a very large data set and must it sustain a high load?
- No.
- Does your app work offline or online?
- Online.
- Does your app have a user interface?
- Yes.
- How many components does your app have?
- Two.
- How many people are on your team?
- Nine.
Based on the above information, the ideal team for this project is the following:
- Two deep learning data scientists
- One person who combines the project manager, system architect, and designer roles
- One back-end software engineer
- One front-end engineer
- Two domain experts
- One quality assurance analyst
Since the project is on the small side and the app functionality isn’t complex, the roles of the product manager, system architect, and designer were combined into one. With this in mind, let’s move on to the expert search process and analyze the channels you can use to find these experts.
AI Talent Hiring Channels
Because AI and deep learning form a fast-moving field and little is known about how to assemble a strong team in this space, it is important to at least enumerate the existing hiring channels. Both internal and external channels are listed below.
In a large organization, the expert search starts internally. If some of the required experts cannot be found in-house, the search expands to the open market. From there, a large organization and a team of independent developers can use the same tactics and channels.
To find internal talent
We suggest contacting the HR team and collaboratively defining the job description and an ideal candidate profile using the roles defined above. An HR team typically uses the following search methods:
- Advertising on an internal job board
- Advertising on a publicly facing corporate job board
- Asking colleagues from a relevant team (for example, an AI team or AI developer evangelists)
To search externally
HR teams may use any or all of the following methods:
- An online marketplace, specialized or general, such as Data Monsters, Datastars, Gigster, Upwork, or LinkedIn ProFinder
- Personal connections offline and via online social networks
- Referrals from existing team members
- Niche professional social networks and communities, for example, Kaggle* or GitHub*
- Job boards and job search engines
- Industry and top academic conferences and hackathons, such as the International Conference on Machine Learning (ICML) and the Conference on Information and Knowledge Management (CIKM)
- Visits to university labs
- Content marketing, blogging, and social media, such as Twitter or LinkedIn
Conclusion
Creating a team is, perhaps, the most important stage of a project, since it is the people who make things happen. To form an effective team, understand and define the expertise needed to develop your project, and then staff your team with these experts.