Skip to main content

Exercise 06: Complete Face Authentication System

This exercise integrates all components from exercises 1-5 into a working face authentication system. The application contains TODO markers where you'll implement the functionality you've learned.

🎯 Learning Objectives

  • Integrate image processing, embeddings, similarity computation, and storage
  • Build a functional face authentication system
  • Understand how components work together in a complete application

📚 Prerequisites

Before starting this exercise, you should have completed:

  1. Exercise 01 - Image Processing & ImageNet Normalization
  2. Exercise 02 - ConvNeXt Model & Embedding Generation
  3. Exercise 03 - Cosine Similarity & Face Comparison
  4. Exercise 04 - Local File Storage for Embeddings
  5. Exercise 05 - Vector Retrieval & Similarity Search

🔧 What You'll Implement

This application contains TODO sections that map directly to your previous learning:

TODOs from Exercise 01 (Image Processing)

  • Image preprocessing and normalization functions
  • Converting images to model-ready tensors

TODOs from Exercise 02 (Embeddings)

  • Model loading and initialization
  • Face embedding computation from images

TODOs from Exercise 03 (Similarity)

  • Cosine similarity calculation between embeddings
  • L2 normalization for embedding comparison

TODOs from Exercise 04 (Storage)

  • Local file storage implementation
  • Embedding persistence and retrieval

TODOs from Exercise 05 (Retrieval)

  • Similarity search across stored embeddings
  • Top-k retrieval for face matching

🚧 Implementation Status

The application framework is provided with TODO markers where you need to implement functionality from exercises 1-5:

  • Image processing (Exercise 01): image_with_std_mean function
  • Model loading & embeddings (Exercise 02): build_model and compute_embeddings functions
  • Similarity computation (Exercise 03): normalize_l2 and cosine_similarity functions
  • Storage system (Exercise 04): All LocalFileVectorStorage methods

Implementation approach:

  1. Follow the TODO comments and reference your previous exercises
  2. Implement incrementally and test each component
  3. See how individual pieces work together in a complete application

🚀 Application Features

The completed system provides:

  • Face embedding generation using ConvNeXt model
  • Local file storage for embeddings in JSON format
  • Real-time face registration from video stream
  • User authentication via face comparison
  • YAML-based configuration

🔍 Finding and Completing TODOs

Step 1: Locate TODO Markers

Search for TODO comments throughout the codebase. These mark the exact locations where you need to apply your knowledge from exercises 1-5:

# Find all TODOs in the project
grep -r "TODO" src/

Step 2: TODO Locations by Exercise

Exercise 01 - Image Processing:

  • src/image_utils/imagenet.rs: image_with_std_mean function

Exercise 02 - Embeddings:

  • src/embeddings/utils.rs: build_model and compute_embeddings functions

Exercise 03 - Similarity:

  • src/login.rs: normalize_l2 and cosine_similarity functions

Exercise 04 - Storage:

  • src/storage/local_file_vector_storage.rs: All storage methods (new, load_data, save_data, store_embedding, get_embedding, get_all_embeddings, delete_embedding)

Exercise 05 - Retrieval (Optional Enhancement):

  • src/login.rs: Optional similarity search optimization concepts

Step 3: Implementation Order

Recommended implementation order:

  1. Exercise 01: Image processing (needed for camera input)
  2. Exercise 02: Model loading and embeddings (core functionality)
  3. Exercise 04: Storage system (needed to save/load embeddings)
  4. Exercise 03: Similarity computation (needed for authentication)
  5. Exercise 05: Optional enhancements (similarity search optimizations)

Step 4: Test Your Implementation

After completing each exercise's TODOs, test incrementally:

# Test after each exercise implementation
cargo build

# Run the full application once all TODOs are complete
cargo run

Step 5: Integration Testing

Once all TODOs are implemented:

  1. Start the camera server (see Prerequisites section)
  2. Run cargo run
  3. Test registration: register → enter username → look at camera
  4. Test login: login → enter username → look at camera
  5. Verify similarity scores and authentication results

Installation

  1. Clone the repository
  2. Install dependencies:
    cargo build

Prerequisites

Camera Server Setup

Before running the face authentication system, you need to start the camera server:

  1. Navigate to camera server directory:

    cd ../camera_server
  2. Install Python dependencies:

    pip install -r requirements.txt
  3. Start the camera server:

    python camera_stream_api.py
  4. Verify camera stream: Open http://localhost:8000/video_feed in your browser

System Requirements

  • Camera: Webcam or external camera connected to your system
  • Python 3.7+: For the camera server
  • Rust 1.70+: For the main application

Configuration

The system uses config.yaml for configuration:

Storage Configuration

storage:
type: "local_file"
local_file:
path: "embeddings.json"

Usage

Running the Application

cargo run

Commands

  • register - Register a new user by capturing face embeddings
  • login - Authenticate an existing user
  • quit or exit - Exit the application

Note: Commands are entered without the / prefix (e.g., type register, not /register)

Registration Process

  1. Run the register command
  2. Enter a user name
  3. Look at the camera while the system captures multiple face samples
  4. The system will store embeddings in your configured storage

Authentication Process

  1. Run the login command
  2. Enter your registered user name
  3. Look at the camera for authentication
  4. The system will compare your face with stored embeddings

Storage

The system uses local file storage to store face embeddings in JSON format. This provides:

  • Simplicity: No external dependencies required
  • Reliability: Works offline and is easy to backup
  • Transparency: Human-readable JSON format for debugging

Configuration Options

Stream Configuration

stream:
url: "http://localhost:8000/video_feed" # Video stream URL
num_images: 5 # Number of samples to capture
interval_millis: 10 # Interval between samples
chunk_size: 8192 # Network chunk size

Model Configuration

model:
name: "timm/convnext_atto.d2_in1k" # Model name
embedding_size: 768 # Embedding vector size

File Structure

src/
├── main.rs # Main application entry point
├── config.rs # Configuration management
├── register.rs # Face registration logic
├── login.rs # Face authentication logic (includes TODO for Ex 03 & 05)
├── storage/ # Storage implementations
│ ├── storage.rs # Storage module exports
│ ├── vector_storage.rs # Storage trait and types
│ └── local_file_vector_storage.rs # TODO: Local file storage implementation (Ex 04)
├── embeddings/ # Embedding computation
│ ├── embeddings.rs # Module exports
│ └── utils.rs # TODO: Model loading and embedding computation (Ex 02)
├── image_utils/ # Image processing utilities
│ ├── image_utils.rs # Module exports
│ └── imagenet.rs # TODO: ImageNet preprocessing (Ex 01)
├── camera/ # Camera integration
│ ├── camera.rs # Module exports
│ └── camera_interactions.rs # Camera capture and streaming logic
└── config.yaml # Configuration file

Files marked with TODO contain implementations you need to complete based on exercises 1-5.

Dependencies

Core Dependencies

  • candle-core/candle-nn: Neural network framework for model inference
  • candle-transformers: Pre-trained model implementations (ConvNeXt)
  • hf-hub: Hugging Face Hub integration for model downloading
  • anyhow: Error handling and propagation

Data & Configuration

  • serde/serde_yaml/serde_json: Serialization for config and storage
  • uuid: Unique identifier generation for embeddings
  • chrono: Timestamp handling for embedding records

Camera & Streaming

  • reqwest: HTTP client for video streaming
  • image: Image processing and format handling
  • minifb: Window management for live video display

Utilities

  • clap: Command line argument parsing (for examples)
  • dotenv: Environment variable loading
  • lazy_static: Static configuration management

Troubleshooting

Video Stream Issues

  • Ensure the video stream URL is accessible
  • Check network connectivity
  • Verify the stream format is supported

Storage Issues

  • Ensure write permissions to the configured file path
  • Check that the directory exists or can be created

🚀 Extra Mile: Advanced Enhancements

Current Limitation

This implementation focuses on face embeddings only - it assumes input images already contain properly cropped and aligned faces. In real-world scenarios, you need face detection as a preprocessing step.

Enhancement Option 1: Complete Face Authentication Pipeline

Integrate face detection to build a complete pipeline:

  1. Add Face Detection: Use rustface - a Rust implementation of SeetaFace detection

    [dependencies]
    rustface = "0.1"
  2. Pipeline Flow:

    Raw Image → Face Detection → Face Cropping → Face Embeddings → Authentication
  3. Benefits:

    • Handle images with multiple faces or no faces
    • Automatic face cropping and alignment
    • More robust real-world deployment
    • Better user experience (users don't need to manually align faces)

Implementation Steps:

  • Add rustface dependency
  • Implement face detection in image preprocessing
  • Add face cropping and alignment
  • Handle edge cases (no faces, multiple faces)

Enhancement Option 2: Production-Grade Vector Storage

Replace JSON storage with Qdrant vector database:

  1. Add Qdrant Integration:

    [dependencies]
    qdrant-client = "1.0"
  2. Benefits Over JSON Storage:

    • Scalability: Handle millions of face embeddings
    • Performance: Optimized vector similarity search
    • Advanced Features: Filtering, clustering, hybrid search
    • Production Ready: Built for high-throughput applications
  3. Implementation:

    • Replace LocalFileVectorStorage with QdrantVectorStorage
    • Implement the same VectorStorage trait
    • Add Qdrant configuration to config.yaml
    • Use Qdrant's native similarity search instead of manual iteration

Why This Matters: JSON storage works for learning but doesn't scale. Production face authentication systems need vector databases to handle thousands of users efficiently.

Choose Your Challenge

  • Option 1 for computer vision enthusiasts who want to understand the complete pipeline
  • Option 2 for backend developers interested in scalable storage solutions
  • Both for a production-ready system!

📈 Learning Progression

This exercise integrates concepts from all previous exercises:

  • Exercise 01: Image preprocessing for neural network input
  • Exercise 02: Model loading and face embedding generation
  • Exercise 03: Similarity computation for face matching
  • Exercise 04: Data persistence for user storage
  • Exercise 05: Efficient similarity search

🎓 Key Learning Outcomes

You now understand:

  • Computer vision fundamentals and image preprocessing
  • Deep learning model integration for feature extraction
  • Vector mathematics for similarity computation
  • Storage systems for embedding persistence
  • System integration and production considerations

🚀 Next Steps

Continue your journey by:

  • Exploring advanced embedding models (FaceNet, ArcFace)
  • Scaling with vector databases (Qdrant, pgvector)
  • Adding security features (liveness detection)
  • Optimizing performance (GPU acceleration)
  • Building production applications

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

This project is licensed under the MIT License.