Credits & Info
The MISpace Hackathon project focuses on weather data science and machine learning. It includes full data-processing pipelines, U-Net forecasting, visualization tools, and a static GitHub Pages dashboard.
Our team built a complete data and machine learning pipeline that uses the MiSpace Network Common Data Form (NetCDF) dataset to generate accurate, human-readable ice concentration forecasts for the Great Lakes. The system includes data preparation, a forecasting model, a visualization pipeline, and documentation that explains how each part works. Together, these components form an end-to-end tool that aligns with the operational needs of the United States Coast Guard (USCG) ice mission.
We began by working with the provided NetCDF files from January 11 to 31. These files contain daily 1024×1024 environmental grids with latitude, longitude, and a temperature variable that represents ice concentration. Darren analyzed the dataset early in the project and reviewed the related documentation. His work helped the team understand the structure of the historical data, the test data, and the goals of the challenge. He also outlined the steps required for success and continued providing reviews and guidance as the technical system developed. Elijah organized all the raw NetCDF files into a clear directory structure and verified that the 21 days formed a continuous time series. This gave us a clean and consistent dataset for supervised learning.
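For readers who want to inspect the files themselves, the sketch below shows how a single daily grid can be opened with xarray. The file path and the assumption that the first data variable holds the ice-concentration grid are illustrative; inspect_nc.py in the repo prints the real structure.

```python
# Minimal inspection sketch, assuming xarray is installed and the path
# below points at one of the daily MiSpace files (hypothetical name).
import xarray as xr

ds = xr.open_dataset("data/raw/mispace_2024-01-11.nc")  # hypothetical path
print(ds)            # dimensions, coordinates, and variables at a glance
print(ds.data_vars)  # should include the ice-concentration variable

first_var = list(ds.data_vars)[0]      # assumption: first variable is the grid
grid = ds[first_var].squeeze().values  # expect a 1024x1024 array
print(grid.shape, float(grid.min()), float(grid.max()))
```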
Using this organized dataset, we built the forecasting system. Marcos developed the data analysis and machine learning pipeline. He processed the January NetCDF files, created tools to load and visualize the 1024×1024 grids, and assembled the seven-day training sequences. He implemented the U-Net model, a convolutional encoder-decoder architecture named for its U-shaped design, and trained several versions of it over multiple weeks. We reviewed each generation of the model and looked for issues such as drift, loss of detail, or unrealistic melting patterns. Through repeated tuning and validation, the model became stable and produced realistic spatial forecasts. Marcos then used the January 25 to 31 window to generate predictions for February 1 to 4. Each prediction was saved as a high-resolution PNG and added to a GIF animation for easy interpretation.
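The windowing step can be summarized with a short sketch. It assumes the 21 January grids have already been cached as a single NumPy stack (the file name is a placeholder); the actual preprocessing lives in src/data_processing/.

```python
# Sketch of assembling seven-day training windows from a stack of daily
# grids shaped (21, H, W). File names are placeholders.
import numpy as np

days = np.load("data/processed/january_stack.npy")  # hypothetical cache

window = 7
X = np.stack([days[i : i + window] for i in range(len(days) - window)])  # (14, 7, H, W)
y = days[window:]                                                        # (14, H, W)

np.save("data/processed/X.npy", X)
np.save("data/processed/y.npy", y)
```

With 21 days and a seven-day window this yields 14 input/target pairs, consistent with the X.npy and y.npy arrays listed under data/processed/.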
Diego built the frontend using HyperText Markup Language (HTML), Cascading Style Sheets (CSS), and JavaScript (JS). He designed a GitHub Pages dashboard that displays the model outputs clearly and makes the forecasts easy to understand. His work focused on user experience and visual clarity. He also created the project's demo video to summarize the workflow and display the predictions.
The final forecasting system takes the previous seven days of ice concentration and predicts the next full 1024×1024 map. Multi-day forecasting is done by feeding each prediction back into the next input window. We used a fixed 0 to 6 color scale that matches the dataset so the images remain consistent across days. Darker blue represents higher ice concentration and lighter blue represents lower concentration or open water. The maps are flipped vertically so north appears at the top, which makes them easier to interpret for operational users. The GIF animations show how the predicted ice evolves across several days.
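A compact sketch of this rollout-and-render loop is shown below. The `model` callable and its input shape are assumptions; predict_unet.py contains the actual inference code.

```python
# Autoregressive rollout plus rendering with the fixed 0-6 color scale.
# `model` stands in for the trained U-Net; its exact interface is assumed.
import numpy as np
import matplotlib.pyplot as plt

def rollout(model, history, steps=4):
    """Feed each prediction back into the seven-day input window."""
    window = list(history)                       # last 7 grids, oldest first
    frames = []
    for _ in range(steps):
        pred = model(np.stack(window)[None])[0]  # assumed (1, 7, H, W) -> (H, W)
        frames.append(pred)
        window = window[1:] + [pred]             # slide the window one day
    return frames

def save_frame(grid, path):
    # vmin/vmax pin the 0-6 scale so colors stay comparable across days;
    # flipud puts north at the top, as described above.
    plt.imshow(np.flipud(grid), cmap="Blues", vmin=0, vmax=6)
    plt.axis("off")
    plt.savefig(path, dpi=200, bbox_inches="tight")
    plt.close()
```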
This project meets the hackathon requirements by using only the provided MiSpace NetCDF dataset and transforming it into a working Great Lakes ice forecasting tool. We prepared the data carefully, trained and refined the model over several weeks, produced scientifically meaningful images, and integrated everything into a clear, user-facing dashboard. The idea is practical and extends the value of the dataset by converting static historical records into short-term forecasts that could support USCG routing and safety decisions.
Overall, our work demonstrates that machine learning can provide short-term Great Lakes ice prediction using the supplied dataset. The final system produces accurate, readable maps and offers a foundation that can be expanded for future operational use.
Download NetCDF files from weather data sources (NOAA, ECMWF, NASA, etc.)
Use xarray or netCDF4 to load and explore the dataset structure (a minimal sketch follows this list)
Handle missing values, outliers, and data quality issues
Create derived features like temperature gradients, moving averages, seasonal indicators
Train machine learning models on processed data
Assess model performance and create visualizations
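As a rough illustration of the loading, cleaning, and feature steps above, the sketch below uses xarray and NumPy. The variable name `ice_conc` and the `time` dimension are placeholders for whatever the actual files use.

```python
# Hedged sketch of loading, cleaning, and deriving features from a NetCDF
# file. Names like "ice_conc" and the "time" dimension are assumptions.
import numpy as np
import xarray as xr

ds = xr.open_dataset("data/raw/sample.nc")  # hypothetical file
var = ds["ice_conc"]                        # placeholder variable name

# Missing values and outliers: mask non-finite cells, clamp to the known 0-6 range
clean = var.where(np.isfinite(var)).clip(min=0, max=6)

# Derived features: spatial gradient magnitude for one day, and a
# three-day moving average along the (assumed) time dimension
gy, gx = np.gradient(clean.isel(time=0).values)
grad_mag = np.hypot(gx, gy)
rolling_mean = clean.rolling(time=3, min_periods=1).mean()
```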
MISpaceHackathon/
├── assets/
│   └── css/
│       └── style.css
│
├── data/
│   ├── raw/                      # (optional; primary raw data lives outside repo)
│   ├── processed/                # ML-ready arrays (X.npy, y.npy)
│   └── external/
│
├── notebooks/
│   └── 01_data_exploration.ipynb
│
├── src/
│   ├── daily_visualizations/     # Jan11-Jan31 PNGs
│   │
│   ├── data_processing/
│   │   ├── load_and_visualize.py    # Generate daily PNGs and cache raw arrays
│   │   ├── inspect_nc.py            # Examine structure and metadata of NetCDF files
│   │   ├── downsample_data.py       # Create lower-resolution datasets for fast experiments
│   │   ├── processor.py             # Utility class for NetCDF and shapefile preprocessing
│   │   └── nc_visualizer_outputs/   # Saved figures from netCDF visualization scripts
│   │
│   ├── models/
│   │   ├── predict_unet.py          # Run trained U-Net to produce February predictions and GIFs
│   │   ├── train_unet.py            # Train U-Net for 5 epochs using (X,y) processed arrays
│   │   └── checkpoints/             # Model weights saved after each epoch
│   │
│   ├── utils/
│   │   └── gif_utils.py             # (optional)
│   │
│   └── predictions_ver_*/           # Feb predictions + GIFs
│
├── index.html
├── data-science.html
└── README.md