
Technical depth / education

Mars surface image classification (CNN features + K-Means)

DCU coursework on the NASA Mars Surface Image Dataset (6,691 labelled Curiosity rover images across 25 categories). EfficientNetB0 for feature extraction, PCA for dimensionality reduction, K-Means for unsupervised clustering, with a silhouette score of around 0.12 to 0.15.

Employer / client
Dublin City University
Duration
EEN1083/EEN1085, 2024
Project type
Computer vision / clustering

Architecture

How the clustering pipeline works

The pipeline takes the 6,691 labelled NASA Curiosity images through four stages. EfficientNetB0 acts as a frozen feature extractor, PCA reduces the features to 50 components, and K-Means clusters them without using the labels. The clusters are then judged on silhouette score (0.12 to 0.15) and on visual inspection of each cluster's make-up.

My clustering pipeline

Process flow

How I work the steps

  1. Load dataset (public dataset): NASA Curiosity images, 25 labelled categories.
  2. Extract features (me): EfficientNetB0 as a frozen feature extractor.
  3. Reduce + cluster (me): PCA to 50 components, then K-Means.
  4. Evaluate (me): silhouette ~0.12 to 0.15 plus cluster inspection.
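The reduce, cluster, and evaluate steps above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the project notebook: the random array stands in for real EfficientNetB0 features (1280 values per image after global average pooling), and `n_clusters=25` is an assumption chosen to match the number of labelled categories, since the value of k is not stated here. The function name `reduce_and_cluster` is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score


def reduce_and_cluster(features, n_components=50, n_clusters=25, seed=0):
    """PCA-reduce feature vectors, cluster with K-Means, score with silhouette."""
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(features)
    # n_clusters=25 is an assumption (one cluster per labelled category).
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    cluster_ids = kmeans.fit_predict(reduced)
    # Silhouette ranges from -1 to 1; values near 0 mean overlapping clusters.
    score = silhouette_score(reduced, cluster_ids)
    return reduced, cluster_ids, score


# Synthetic stand-in for EfficientNetB0 features (not the real dataset).
rng = np.random.default_rng(0)
fake_features = rng.normal(size=(500, 1280)).astype(np.float32)

reduced, cluster_ids, score = reduce_and_cluster(fake_features)
print(reduced.shape, len(set(cluster_ids)), round(float(score), 3))
```

The labels are never passed to `reduce_and_cluster`, which is what keeps the clustering unsupervised; they would only be used afterwards to inspect what each cluster contains.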

How I built it

  • EfficientNetB0 used as a frozen feature extractor; PCA reduced the feature vectors to 50 components before clustering.
  • Evaluated cluster quality with silhouette score and visual inspection of cluster compositions, not just accuracy against the labels.
  • Code and notebook on GitHub so the analysis is reproducible end to end.
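A frozen feature extractor of this kind might look like the sketch below. It assumes the Keras build of EfficientNetB0 bundled with TensorFlow, ImageNet weights, 224x224 inputs, and global average pooling. None of those settings are confirmed by this page, and the helper names `build_extractor` and `extract_features` are hypothetical.

```python
import numpy as np
import tensorflow as tf


def build_extractor(weights="imagenet"):
    """EfficientNetB0 without its classification head, with frozen weights."""
    model = tf.keras.applications.EfficientNetB0(
        include_top=False,          # drop the ImageNet classifier layer
        weights=weights,            # "imagenet" downloads pretrained weights
        pooling="avg",              # global average pooling: one 1280-d vector per image
        input_shape=(224, 224, 3),  # assumed input size
    )
    model.trainable = False  # frozen: the network is used for inference only
    return model


def extract_features(model, images):
    """images: float array of shape (n, 224, 224, 3) with pixel values in [0, 255]."""
    # The Keras EfficientNet models rescale inputs internally,
    # so raw pixel values are passed in without extra preprocessing.
    return model.predict(images, verbose=0)
```

Calling `extract_features(build_extractor(), batch)` on a batch of resized images returns an array of shape `(n, 1280)`, which is the input PCA would then reduce to 50 components.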

Measured results

What I measured

6.7K NASA Curiosity images clustered

K-Means on PCA-reduced EfficientNetB0 features produced meaningful groupings for visually distinct surface types (drill, wheels, horizon).

Findings

  • Categories with high visual similarity (e.g. ground vs observation tray) overlapped, as expected. Written up rather than glossed over.
  • Silhouette score of about 0.12 to 0.15: positive but weak separation, so clusters overlap rather than splitting cleanly.
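One way to inspect cluster make-up against the labels is a simple per-cluster category count, sketched below with the standard library only. The cluster assignments and labels are toy values invented for illustration, not the project's results, and the helper names `cluster_composition` and `cluster_purity` are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_composition(cluster_ids, labels):
    """Count how many images of each labelled category fall in each cluster."""
    composition = defaultdict(Counter)
    for cluster_id, label in zip(cluster_ids, labels):
        composition[cluster_id][label] += 1
    return dict(composition)


def cluster_purity(counts):
    """Share of a cluster taken by its most common category (1.0 = one category only)."""
    return counts.most_common(1)[0][1] / sum(counts.values())


# Toy assignments for illustration only; these are not the project's results.
toy_clusters = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
toy_labels = [
    "drill", "drill", "drill",
    "ground", "observation tray", "ground", "observation tray",
    "wheels", "wheels", "horizon",
]

for cluster_id, counts in sorted(cluster_composition(toy_clusters, toy_labels).items()):
    print(cluster_id, dict(counts), round(cluster_purity(counts), 2))
```

In the toy data, cluster 1 splits evenly between ground and observation tray, so its purity is 0.5. That is the kind of overlap between visually similar categories that a single silhouette number would not show on its own.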

Tools I used

  • Python
  • TensorFlow
  • EfficientNetB0
  • Scikit-learn
  • PCA
  • K-Means