🌦️ MISpace Hackathon

Credits & Info

Laker IceLabs Team Members:

Elijah Morgan - Data Reading

Darren Fife - Data Reading

Marcos Sanson - ML Model Development

Diego de Jong - Frontend Development

About This Project

The MISpace Hackathon project focuses on weather data science and machine learning. It includes full data processing pipelines, U-Net forecasting, visualization tools, and a static GitHub Pages dashboard. We work with:

  • NetCDF Files: Industry-standard format for weather and climate data
  • Weather Datasets: Temperature, precipitation, wind speed, and atmospheric pressure
  • Machine Learning: Predictive models for weather forecasting and pattern recognition
  • Data Visualization: Interactive charts and maps for weather data exploration

Technologies

Python NetCDF4 xarray Pandas Scikit-learn Matplotlib Jupyter

Project Write-Up

Our team built a complete data and machine learning pipeline that uses the MISpace Network Common Data Form (NetCDF) dataset to generate accurate, human-readable ice concentration forecasts for the Great Lakes. The system includes data preparation, a forecasting model, a visualization pipeline, and documentation that explains how each part works. Together, these components form an end-to-end tool that aligns with the operational needs of the United States Coast Guard (USCG) ice mission.

We began by working with the provided NetCDF files from January 11 to 31. These files contain daily 1024×1024 environmental grids with latitude, longitude, and a temperature variable that represents ice concentration. Darren analyzed the dataset early in the project and reviewed the related documentation. His work helped the team understand the structure of the historical data, the test data, and the goals of the challenge. He also outlined the steps required for success and continued providing reviews and guidance as the technical system developed. Elijah organized all the raw NetCDF files into a clear directory structure and verified that the 21 days formed a continuous time series. This gave us a clean and consistent dataset for supervised learning.
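The continuity check described above can be sketched in plain Python. The `ice_YYYYMMDD.nc` naming scheme below is a hypothetical stand-in for the actual file names, and `data/raw` for the actual directory; an empty result confirms one file per day across the January 11 to 31 window.

```python
from datetime import date, timedelta
from pathlib import Path


def missing_days(raw_dir: Path, start: date, end: date,
                 pattern: str = "ice_{:%Y%m%d}.nc") -> list[date]:
    """Return any dates in [start, end] without a matching NetCDF file.

    An empty list confirms the daily series is continuous. The file
    naming pattern is an assumption for illustration.
    """
    gaps = []
    for i in range((end - start).days + 1):
        day = start + timedelta(days=i)
        if not (raw_dir / pattern.format(day)).exists():
            gaps.append(day)
    return gaps


# Example: verify the 21-day January window (11th through 31st).
gaps = missing_days(Path("data/raw"), date(2024, 1, 11), date(2024, 1, 31))
```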

Using this organized dataset, we built the forecasting system. Marcos developed the data analysis and machine learning pipeline. He processed the January NetCDF files, created tools to load and visualize the 1024×1024 grids, and assembled the seven-day training sequences. He implemented the U-Net model, a convolutional encoder-decoder architecture, and trained several versions of it over multiple weeks. We reviewed each generation of the model and looked for issues such as drift, loss of detail, or unrealistic melting patterns. Through repeated tuning and validation, the model became stable and produced realistic spatial forecasts. Marcos then used the January 25 to 31 window to generate predictions for February 1 to 4. Each prediction was saved as a high-resolution PNG and added to a GIF animation for easy interpretation.
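The seven-day training sequences can be assembled with a simple sliding window over the stacked daily grids. This is a minimal NumPy sketch, not the project's exact preprocessing code: with 21 January days and a 7-day window it yields 14 (input, target) pairs.

```python
import numpy as np


def make_sequences(grids: np.ndarray, window: int = 7):
    """Build supervised pairs from a stack of daily maps.

    grids: array of shape (T, H, W) holding T consecutive daily
           ice-concentration grids (e.g. T=21, H=W=1024).
    Returns X of shape (T - window, window, H, W) - seven prior days -
    and y of shape (T - window, H, W) - the following day.
    """
    T = grids.shape[0]
    X = np.stack([grids[i:i + window] for i in range(T - window)])
    y = grids[window:]
    return X, y
```

The same pairs can then be saved as the ML-ready `X.npy` / `y.npy` arrays kept under `data/processed/`.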

Diego built the frontend using HyperText Markup Language (HTML), Cascading Style Sheets (CSS), and JavaScript (JS). He designed a GitHub Pages dashboard that displays the model outputs clearly and makes the forecasts easy to understand. His work focused on user experience and visual clarity. He also created the project’s demo video to summarize the workflow and display the predictions.

The final forecasting system takes the previous seven days of ice concentration and predicts the next full 1024×1024 map. Multi-day forecasting is done by feeding each prediction back into the next input window. We used a fixed 0-to-6 color scale that matches the dataset so the images remain consistent across days. Darker blue represents higher ice concentration and lighter blue represents lower concentration or open water. The maps are flipped vertically so north appears at the top, which makes them easier for operational users to interpret. The GIF animations show how the predicted ice evolves across several days.
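The autoregressive loop, the fixed 0-to-6 scale, and the vertical flip can be sketched as follows. Here `model` is any callable standing in for the trained U-Net; the helper names are illustrative, not the project's actual API.

```python
import numpy as np


def rollout(model, history: np.ndarray, steps: int = 4) -> list[np.ndarray]:
    """Multi-day forecast by feeding each prediction back into the window.

    model:   callable mapping a (7, H, W) window to an (H, W) map
             (stands in for the trained U-Net).
    history: the last seven observed days, shape (7, H, W).
    Returns one display-ready frame per forecast day.
    """
    window = history.copy()
    frames = []
    for _ in range(steps):
        pred = model(window)
        # Clamp to the dataset's fixed 0-6 scale so frames stay
        # comparable across days (rendered with vmin=0, vmax=6).
        pred = np.clip(pred, 0.0, 6.0)
        # Flip vertically only for display, so north appears at the top.
        frames.append(np.flipud(pred))
        # Slide the window: drop the oldest day, append the new prediction.
        window = np.concatenate([window[1:], pred[None]], axis=0)
    return frames
```

Each frame could then be written out as a PNG (e.g. via Matplotlib with `vmin=0, vmax=6`) and stitched into the GIF animation.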

This project meets the hackathon requirements by using only the provided MISpace NetCDF dataset and transforming it into a working Great Lakes ice forecasting tool. We prepared the data carefully, trained and refined the model over several weeks, produced scientifically meaningful images, and integrated everything into a clear user-facing dashboard. The idea is practical and extends the value of the dataset by converting static historical records into short-term forecasts that could support USCG routing and safety decisions.

Overall, our work demonstrates that machine learning can provide short-term Great Lakes ice prediction using the supplied dataset. The final system produces accurate and readable maps and offers a foundation that can be expanded for future operational use.

βš™οΈ Data Processing Workflow

  1. Data Acquisition:

    Download NetCDF files from weather data sources (NOAA, ECMWF, NASA, etc.)

  2. Data Loading:

    Use xarray or netCDF4 to load and explore the dataset structure

  3. Data Cleaning:

    Handle missing values, outliers, and data quality issues

  4. Feature Engineering:

    Create derived features like temperature gradients, moving averages, seasonal indicators

  5. Model Training:

    Train machine learning models on processed data

  6. Evaluation & Visualization:

    Assess model performance and create visualizations
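Steps 3 and 4 above can be sketched as small helpers. This is a minimal sketch, not the project's actual cleaning code; the 0-to-6 clamp reflects this dataset's value range, and the function names are illustrative.

```python
import numpy as np


def clean(grid: np.ndarray, fill: float = 0.0) -> np.ndarray:
    """Step 3: replace missing values and clamp obvious outliers.

    NaNs become `fill` (open water assumed), and values are clamped to
    the dataset's 0-6 ice-concentration range.
    """
    out = np.nan_to_num(grid, nan=fill)
    return np.clip(out, 0.0, 6.0)


def moving_average(series: np.ndarray, window: int = 3) -> np.ndarray:
    """Step 4: a simple derived feature - a trailing moving average
    over a per-cell daily time series."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="valid")


def load_day(path):
    """Step 2: open one NetCDF file (xarray imported lazily so the
    pure-NumPy helpers above work without it installed)."""
    import xarray as xr
    return xr.open_dataset(path)
```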

πŸ“ Our Project Structure

MISpaceHackathon/
├── assets/
│   └── css/
│       └── style.css
│
├── data/
│   ├── raw/                 # (optional; primary raw data lives outside repo)
│   ├── processed/           # ML-ready arrays (X.npy, y.npy)
│   └── external/
│
├── notebooks/
│   └── 01_data_exploration.ipynb
│
├── src/
│   ├── daily_visualizations/         # Jan 11-Jan 31 PNGs
│   │
│   ├── data_processing/
│   │   ├── load_and_visualize.py        # Generate daily PNGs and cache raw arrays
│   │   ├── inspect_nc.py                # Examine structure and metadata of NetCDF files
│   │   ├── downsample_data.py           # Create lower-resolution datasets for fast experiments
│   │   ├── processor.py                 # Utility class for NetCDF and shapefile preprocessing
│   │   └── nc_visualizer_outputs/       # Saved figures from netCDF visualization scripts
│   │
│   ├── models/
│   │   ├── predict_unet.py              # Run trained U-Net to produce February predictions and GIFs
│   │   ├── train_unet.py                # Train U-Net for 5 epochs using (X, y) processed arrays
│   │   └── checkpoints/                 # Model weights saved after each epoch
│   │
│   ├── utils/
│   │   └── gif_utils.py (optional)
│   │
│   └── predictions_ver_*/            # Feb predictions + GIFs
│
├── index.html
├── data-science.html
└── README.md