Day 1: AI Foundations and Basic ML Concepts
Module 1: Introduction to AI in Software Development
- AI vs. Conventional Systems
- Narrow, General, and Super AI
- AI hardware (GPUs, TPUs) and popular frameworks
- AI in the SDLC
- Benefits (automation, predictive insights) and risks (maintenance, data quality)
- AI as a Service (AIaaS) and service contracts
Module 2: Quality Characteristics and Ethics in AI
- Key Quality Factors
- Flexibility, Adaptability, Autonomy
- Transparency & Explainability
- Ethical & Regulatory Considerations
- Bias, Reward Hacking, and compliance (e.g., GDPR)
- Risk Management
- Identifying and mitigating biases
- Documentation of AI components
Module 3: Machine Learning Overview
- ML Types
- Supervised, Unsupervised, Reinforcement Learning
- ML Workflow
- Data collection, preprocessing, model training, evaluation, deployment
- Overfitting & Underfitting
- Causes, detection, and mitigation (e.g., regularization)
Lab 1: Overfitting & Underfitting (Titanic Dataset)
- Scenario: Predict passenger survival on the Titanic.
- Goal: Demonstrate how model complexity influences performance.
- Lab Steps:
- Load and preprocess Titanic data.
- Train multiple classifiers (e.g., logistic regression vs. random forest).
- Observe overfitting/underfitting effects on validation accuracy.
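The lab's core comparison can be sketched as follows. This is a minimal stand-in, not the lab solution: it uses sklearn's synthetic `make_classification` data instead of the Titanic CSV so it runs without any download, and contrasts a low-variance model (logistic regression) with a high-variance one (an unconstrained random forest) via their train/validation accuracy gap.

```python
# Sketch of Lab 1's idea on a synthetic stand-in for the Titanic data,
# so it runs without downloading anything.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=800, n_features=10, n_informative=4,
                           flip_y=0.1, random_state=0)  # flip_y adds label noise
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

results = {}
for model in (LogisticRegression(max_iter=1000),        # low variance
              RandomForestClassifier(random_state=0)):  # high variance, fully grown trees
    model.fit(X_tr, y_tr)
    train_acc = model.score(X_tr, y_tr)
    val_acc = model.score(X_val, y_val)
    # A large train/validation gap signals overfitting; low accuracy on
    # both sets signals underfitting.
    results[type(model).__name__] = (train_acc, val_acc)
    print(type(model).__name__, round(train_acc, 3), round(val_acc, 3))
```

The random forest typically fits the noisy training labels almost perfectly while its validation accuracy lags behind, which is exactly the gap the lab asks students to observe.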
Day 2: Data Handling, Regression, and Prompt Engineering
Module 4: Data Preparation & Handling
- Data Quality
- Cleaning, handling missing values, outliers, categorical features
- Train/validation/test splits
- Common Pitfalls
- Imbalanced classes, mislabeled data, domain knowledge gaps
Lab 2: Data Preparation (NYC Taxi Dataset)
- Scenario: Forecast taxi fares in NYC (regression).
- Goal: Clean a real-world dataset and create meaningful features.
- Lab Steps:
- Load NYC Yellow Cab data (pickup/dropoff times, distances).
- Handle missing data and detect outliers.
- Engineer features (e.g., time-of-day, trip distance).
Module 5: Model Evaluation Metrics
- Regression Metrics
- MSE, RMSE, MAE, R²
- Classification Recap
- Accuracy, precision, recall, F1-score, confusion matrix
- Choosing the Right Metric
- Contextual needs (business value, safety-critical systems)
Lab 3: Regression Modeling (NYC Taxi Fares)
- Scenario: Build and evaluate models to predict fare amounts.
- Goal: Compare linear regression vs. gradient boosting to measure error rates.
- Lab Steps:
- Train at least two regression models on NYC Taxi data.
- Compute MSE, RMSE, and MAE on the validation set.
- Discuss feature importance and next steps.
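The model comparison in this lab can be sketched as below. As a hedge, it uses sklearn's synthetic `make_regression` data in place of the taxi-fare features, but keeps the lab's structure: two regressors, validation-set predictions, and MSE/RMSE/MAE side by side.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Synthetic stand-in for the taxi-fare features/targets.
X, y = make_regression(n_samples=600, n_features=6, noise=10.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

metrics = {}
for model in (LinearRegression(), GradientBoostingRegressor(random_state=0)):
    pred = model.fit(X_tr, y_tr).predict(X_val)
    mse = mean_squared_error(y_val, pred)
    metrics[type(model).__name__] = {
        "MSE": mse,
        "RMSE": np.sqrt(mse),                          # same units as the target
        "MAE": mean_absolute_error(y_val, pred),       # robust to large errors
    }
print(metrics)
```

Note that MAE never exceeds RMSE for the same predictions; a large RMSE/MAE ratio hints that a few big errors dominate, which matters when choosing a metric for fare prediction.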
Module 6: Prompt Engineering for Generative AI
- Prompting Best Practices
- Role-based prompting, zero-shot vs. few-shot, chain-of-thought reasoning
- Structuring prompts for clarity, constraints, and context
- Iterative refinement (synonyms, repeated keywords, output format)
Lab 4: Designing a Sophisticated Prompt for Software Engineering
- Scenario: Generate detailed, actionable advice on software architecture, testing, or refactoring in a microservices environment.
- Goal: Apply advanced prompting techniques (role prompting, constraints, few-shot examples) to create a high-quality prompt that yields expert-level recommendations.
- Lab Steps:
- Define Context & Role (e.g., “You are a principal software architect…”).
- Provide Examples (show how you want the answer structured or styled).
- Add Constraints (limit response length, include specific bullet points).
- Iterate & Refine (test and adjust wording for clarity & precision).
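The four steps above can be assembled programmatically, which also makes the iterate-and-refine step easy (swap out one piece, re-run). The wording below is illustrative, not a prescribed template.

```python
# Assemble the lab's four prompt components into one prompt string.
role = ("You are a principal software architect with 15 years of "
        "microservices experience.")                       # Step 1: context & role
example = ("Example bullet:\n"
           "- Extract the billing logic into its own service; "
           "risk: distributed transactions.")              # Step 2: few-shot example
constraints = ("Answer in at most 5 bullet points. Each bullet names a "
               "concrete refactoring step and its main risk.")  # Step 3: constraints
task = ("How should we split a monolithic order-management "
        "system into microservices?")

prompt = "\n\n".join([role, example, constraints, task])
print(prompt)
```

Keeping each component in its own variable makes Step 4 (iteration) a matter of editing one string and comparing the model's outputs.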
Day 3: Neural Networks, Explainability, and Responsible AI
Module 7: Neural Networks Introduction
- NN Basics
- Perceptrons, hidden layers, activation functions
- NN Use Cases
- Images, text, speech; large-scale data
- Testing Neural Networks
- Special considerations, coverage measures
Lab 5: Neural Network Classification (MNIST)
- Scenario: Classify handwritten digits from MNIST.
- Goal: Implement and train a feed-forward neural network.
- Lab Steps:
- Load MNIST images (28x28).
- Build a simple network (e.g., feed-forward).
- Evaluate accuracy and discuss improvements (layers, dropout, etc.).
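A compact version of the lab can be run with sklearn's built-in 8x8 digits set as a lightweight stand-in for the 28x28 MNIST images; the structure (load images, build a feed-forward net, evaluate accuracy) is the same, only the image resolution and training time shrink.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# 8x8 digits as a lightweight stand-in for 28x28 MNIST.
X, y = load_digits(return_X_y=True)
X = X / 16.0                      # scale pixel values to [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# One hidden layer of 32 ReLU units = a simple feed-forward network.
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=0)
net.fit(X_tr, y_tr)
accuracy = net.score(X_te, y_te)
print(f"test accuracy: {accuracy:.3f}")
```

The discussion step then varies `hidden_layer_sizes` (more layers/units) and regularization (`alpha`, or dropout in a deep-learning framework) and watches how accuracy responds.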
Module 8: Testing & Model Explainability
- Levels of Testing
- Input data testing, model testing, system & acceptance testing
- Adversarial Attacks & Data Poisoning
- Defenses, monitoring strategies
- Explainability Methods
- LIME, SHAP, local vs. global interpretation
Lab 6: Model Explainability (U.S. Housing with LIME)
- Scenario: Stakeholders want insights into house pricing predictions.
- Goal: Use LIME to explain predictions of a regression model.
- Lab Steps:
- Train a regression model on a U.S. housing dataset (e.g., Ames Housing).
- Apply LIME to interpret specific predictions.
- Identify potential biases or anomalies in model behavior.
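LIME's mechanism can be demystified before the lab with a hand-rolled miniature: perturb one instance, weight the perturbations by proximity, and fit a local weighted linear surrogate whose coefficients act as feature attributions. The lab itself uses the `lime` package on a housing dataset; this sketch only shows the underlying idea on synthetic data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge

# A black-box regression model on synthetic stand-in data.
X, y = make_regression(n_samples=500, n_features=5, random_state=0)
black_box = RandomForestRegressor(random_state=0).fit(X, y)

rng = np.random.default_rng(0)
x0 = X[0]                                                # instance to explain
Z = x0 + rng.normal(scale=0.5, size=(200, X.shape[1]))   # local perturbations
weights = np.exp(-np.linalg.norm(Z - x0, axis=1) ** 2)   # proximity kernel

# Fit a weighted linear surrogate to the black box's local behavior.
surrogate = Ridge().fit(Z, black_box.predict(Z), sample_weight=weights)
explanation = surrogate.coef_                            # local feature attributions
print(np.round(explanation, 2))
```

Seeing the surrogate's coefficients here makes it easier to read the bar charts that `lime` produces in the lab, and to spot when a feature's local weight contradicts domain expectations (a potential bias or anomaly).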
Module 9: Responsible AI & Wrap-Up
- Governance & Compliance
- Privacy, fairness, disclaimers, accountability
- Future Trends
- Large Language Models (LLMs), multi-modal AI, MLOps
- Key Takeaways
- Data and model versioning, transparency, bias mitigation, robust QA
Summary of Labs
- Lab 1: Overfitting & Underfitting (Titanic)
- Lab 2: Data Preparation (NYC Taxi)
- Lab 3: Regression Modeling (NYC Taxi Fares)
- Lab 4: Designing a Sophisticated Prompt for Software Engineering (GenAI)
- Lab 5: Neural Network Classification (MNIST)
- Lab 6: Model Explainability (U.S. Housing with LIME)