diff --git a/README.md b/README.md
index 33a2e458..168356e9 100644
--- a/README.md
+++ b/README.md
@@ -1,16 +1,491 @@
-# ML4SCI.github.io
+
-Machine Learning for Science github site
+ML4SCI GSoC Logo
-- Live at https://ml4sci.org
+# Machine Learning for Science (ML4SCI)
-## About ML4SCI
+### Google Summer of Code - Official Organization Repository
-Machine Learning for Science (ML4Sci) is an open-source organization that brings together modern machine learning techniques and applies them to cutting edge problems in Science, Technology, Engineering, and Math (STEM).
+[![GSoC](https://img.shields.io/badge/GSoC-2026-4285F4?style=flat-square&logo=google&logoColor=white)](https://summerofcode.withgoogle.com/)
+[![Website](https://img.shields.io/badge/Website-ml4sci.org-brightgreen?style=flat-square)](https://ml4sci.org)
+[![License](https://img.shields.io/badge/License-Open%20Source-blue?style=flat-square)](#)
+[![Jekyll](https://img.shields.io/badge/Built%20with-Jekyll-red?style=flat-square)](https://jekyllrb.com/)
+[![Contact](https://img.shields.io/badge/Email-ml4--sci%40cern.ch-orange?style=flat-square)](mailto:ml4-sci@cern.ch)
-## ML4SCI in GSoC 2022
+**Applying cutting-edge machine learning to the world's most challenging scientific problems.**
-The ML4Sci open source organization is participating in the [2022 Google Summer of Code](https://summerofcode.withgoogle.com/). If you are a student interested in our [projects](https://ml4sci.org/activities/gsoc.html) please check our [ideas page](https://ml4sci.org/gsoc/2022/summary.html). ML4Sci is an umbrella organization that welcomes other projects and organizations related to machine-learning for science. Please contact the admins at [ml4-sci@cern.ch](ml4-sci@cern.ch) if you are interested in participating as a project.
-![GSOC](https://ml4sci.org/images/GSoC/GSoC-icon-192.png)
+[🌐 Live Site](https://ml4sci.org) · [📋 GSoC Projects](https://ml4sci.org/gsoc/2026/summary.html) · [📝 Apply Now](#how-to-apply) · [💬 Contact Us](mailto:ml4-sci@cern.ch)
-Please take a look at our [GSoC Page](https://ml4sci.org/activities/gsoc.html) for more details.
+
+
+---
+
+## 📖 Table of Contents
+
+- [About ML4SCI](#-about-ml4sci)
+- [GSoC Program Overview](#-gsoc-program-overview)
+- [2026 Research Projects](#-2026-research-projects)
+- [Participating Organizations](#-participating-organizations)
+- [How to Apply](#-how-to-apply)
+- [For Mentors](#-for-mentors)
+- [GSoC Timeline](#-gsoc-timeline)
+- [Student Blogs & Publications](#-student-blogs--publications)
+- [Repository Structure](#-repository-structure)
+- [Contributing to This Website](#-contributing-to-this-website)
+- [Program History](#-program-history)
+- [Administrators & Contact](#-administrators--contact)
+
+---
+
+## 🔬 About ML4SCI
+
+**Machine Learning for Science (ML4SCI)** is an open-source umbrella organization that bridges the gap between modern machine learning techniques and cutting-edge scientific research. We bring together researchers, engineers, and students from top universities and laboratories worldwide to tackle the hardest unsolved problems in STEM.
+
+### What We Do
+
+ML4SCI participants develop and apply state-of-the-art ML methods, from deep learning and quantum computing to symbolic regression and graph neural networks, to real scientific challenges including:
+
+| Domain | Applications |
+|--------|-------------|
+| ⚛️ High-Energy Physics | Particle identification, detector simulation, event reconstruction |
+| 🔭 Astrophysics | Gravitational lensing, dark matter detection, exoplanet discovery |
+| 🌌 Cosmology | Strong lensing classification, galaxy evolution, cosmic ray analysis |
+| 🪐 Planetary Science | Surface composition mapping, albedo modeling, crater detection |
+| 🔬 Quantum Science | Quantum ML algorithms, NMR spin decoding, variational circuits |
+| 🧬 Biophysics | Neural signal processing, spectroscopy, medical imaging |
+
+ML4SCI participants are mentored by world-class scientists at institutions including CERN, NASA, MIT, Brown University, University of Alabama, and many more.
+
+---
+
+## 🎯 GSoC Program Overview
+
+ML4SCI has participated in **Google Summer of Code** since 2021, providing students with the opportunity to contribute 175 hours of open-source ML research over a structured coding period.
+
+### Why ML4SCI GSoC?
+
+- 🏆 **Real scientific impact** - your code runs on actual experimental data
+- 🧑‍🔬 **Expert mentorship** - work 1:1 with PhDs and research scientists
+- 🌍 **Global community** - collaborate with researchers across 30+ institutions
+- 📄 **Publication opportunities** - many projects lead to co-authored papers
+- 💡 **Cutting-edge methods** - quantum ML, graph neural networks, transformers, diffusion models, and more
+- 🎓 **Skill development** - gain experience with real HPC, CERN datasets, and scientific Python ecosystems
+
+### GSoC Participation History
+
+| Year | Students | Projects | Organizations |
+|------|----------|----------|---------------|
+| 2021 | ~20 | 13 | 20+ |
+| 2022 | ~25 | 11 | 25+ |
+| 2023 | ~30 | 13 | 30+ |
+| 2024 | ~35 | 14 | 40+ |
+| 2025 | ~40 | 15 | 40+ |
+| 2026 | TBD | 16 | 40+ |
+
+---
+
+## 🚀 2026 Research Projects
+
+Below is an overview of the active research projects for GSoC 2026. Each project has multiple proposal topics - visit the [full project list](https://ml4sci.org/gsoc/2026/summary.html) for detailed descriptions, mentor contact, and evaluation tests.
+
+| Project | Area | Key ML Methods |
+|---------|------|----------------|
+| **[CMS](https://ml4sci.org/gsoc/projects/project_CMS.html)** | Particle Physics | End-to-end deep learning, compression |
+| **[DeepLense](https://ml4sci.org/gsoc/projects/project_DEEPLENSE.html)** | Dark Matter / Lensing | CNNs, diffusion models, self-supervised learning |
+| **[E2E](https://ml4sci.org/gsoc/projects/project_E2E.html)** | Particle Reconstruction | Transformers, masked autoencoders, GNNs |
+| **[EXXA](https://ml4sci.org/gsoc/projects/project_EXXA.html)** | Exoplanets | Equivariant networks, foundation models |
+| **[DeepFALCON](https://ml4sci.org/gsoc/projects/project_FALCON.html)** | Fast Simulation | Generative models, GANs |
+| **[FASEROH](https://ml4sci.org/gsoc/projects/project_FASEROH.html)** | Symbolic Methods | Seq2seq, histogram regression |
+| **[GENIE](https://ml4sci.org/gsoc/projects/project_GENIE.html)** | Anomaly Detection | Contrastive learning, generative models |
+| **[LOX](https://ml4sci.org/gsoc/projects/project_LOX.html)** | Nuclear Astrophysics | Classification, spectral analysis |
+| **[Lunar Prospector](https://ml4sci.org/gsoc/projects/project_LUNARPROSPECTOR.html)** | Planetary Science | Image analysis, regression |
+| **[MESSENGER](https://ml4sci.org/gsoc/projects/project_MESSENGER.html)** | Planetary Science | Surface composition, spectroscopy |
+| **[ML4DQM](https://ml4sci.org/gsoc/projects/project_ML4DQM.html)** | Detector QA | Anomaly detection, monitoring |
+| **[NeuroDyad](https://ml4sci.org/gsoc/projects/project_NEURODYAD.html)** | Neuroscience | Neural signal processing |
+| **[PREDICT](https://ml4sci.org/gsoc/projects/project_PREDICT.html)** | Multi-domain | Predictive modeling |
+| **[QMLHEP](https://ml4sci.org/gsoc/projects/project_QMLHEP.html)** | Quantum ML | QNNs, variational circuits, QGAN |
+| **[SYMBA](https://ml4sci.org/gsoc/projects/project_SYMBA.html)** | Symbolic AI | Transformer-based symbolic regression |
+| **[SYMMETRY](https://ml4sci.org/gsoc/projects/project_SYMMETRY.html)** | Equivariant Models | Lie groups, symmetry discovery |
+
+> 📌 **New for 2026:** Check back on **February 19th, 2026** for the full list of accepted proposals once the GSoC organization list is announced by Google.
+
+---
+
+## 🏛️ Participating Organizations
+
+ML4SCI is proudly supported by researchers from institutions across the globe:
+
+
+🇺🇸 United States (click to expand)
+
+- University of Alabama
+- Brown University
+- Carnegie Mellon University
+- Cornell University
+- Dartmouth College
+- Davidson College
+- University of Florida / Florida State University
+- University of Georgia
+- Johns Hopkins University Applied Physics Laboratory (JHUAPL)
+- University of Kansas
+- University of Kentucky
+- Keck Graduate Institute
+- Los Alamos National Laboratory (LANL)
+- MathWorks
+- Massachusetts Institute of Technology (MIT)
+- NASA Goddard Space Flight Center
+- New York University
+- University of South Carolina
+- University of Washington
+- University of Wisconsin–Madison
+
+
+ +
+🌍 Europe & Middle East (click to expand)
+
+- CERN (Switzerland)
+- EPFL (Switzerland)
+- University of Erlangen–Nuremberg (Germany)
+- University of Leeds (UK)
+- Middle East Technical University (Turkey)
+- NTUA – National Technical University of Athens (Greece)
+- Institut Polytechnique de Paris (France)
+- Polytechnic University of Catalonia (Spain)
+- Princess Sumaya University for Technology (Jordan)
+- Qassim University (Saudi Arabia)
+- RWTH Aachen University (Germany)
+- Technical University of Munich (Germany)
+- American University of Beirut (Lebanon)
+
+
+ +
+🌏 Asia & Rest of World (click to expand)
+
+- IIT Dhanbad
+- BITS Pilani – Goa & Hyderabad Campuses (India)
+- NISER (India)
+- VIT – Vishwakarma Institute of Technology (India)
+
+
+
+---
+
+## 📝 How to Apply
+
+> **⚠️ The 2026 GSoC student application period has not yet begun.** Check back February 19th, 2026 for updates.
+
+### Application Process (General)
+
+```
+1. Browse Project Ideas → 2. Contact Mentors → 3. Complete Evaluation Test
+         ↓                                                ↓
+4. Write Your Proposal ←←←←←←←←← Feedback ←←←←←←←←←←←←
+         ↓
+5. Submit via GSoC Portal by deadline
+```
+
+### Step-by-Step Guide
+
+1. **Explore the ideas page** - visit [ml4sci.org/gsoc/2026/summary.html](https://ml4sci.org/gsoc/2026/summary.html) and read through available projects
+2. **Reach out to mentors** - introduce yourself early and ask questions about the project scope
+3. **Complete the evaluation test** - each project has a specific coding/analysis test to demonstrate readiness
+4. **Write a strong proposal** - your proposal should include:
+   - Project understanding and motivation
+   - Technical approach and methodology
+   - Timeline with milestones
+   - Relevant background and experience
+5. **Submit on time** - proposals must be submitted through the [official GSoC portal](https://summerofcode.withgoogle.com/)
+
+### Tips for a Strong Application
+
+- ✅ Start early - reach out to mentors at least 4–6 weeks before the deadline
+- ✅ Be specific - clearly describe your technical approach, not just goals
+- ✅ Show your work - include relevant code samples, GitHub links, or prior projects
+- ✅ Demonstrate domain knowledge - show you understand the science behind the project
+- ✅ Ask good questions - thoughtful questions demonstrate engagement with the problem
+- ❌ Don't send generic proposals to multiple mentors simultaneously without customization
+
+> 📚 See the full [student guidelines](https://ml4sci.org/gsoc/students-guideline.html) for more details.
+
+---
+
+## 🧑‍🏫 For Mentors
+
+Interested in mentoring a GSoC student under the ML4SCI umbrella?
+
+### Requirements
+
+- Your project must have a **clear connection to machine learning applied to science**
+- You must be able to commit to mentoring (~5 hrs/week) during the coding period
+- You need to prepare an **evaluation test** for candidates
+
+### How to Join
+
+1. Contact the admins at [ml4-sci@cern.ch](mailto:ml4-sci@cern.ch) before the organization application deadline
+2. Prepare your project description following the [proposal example](https://ml4sci.org/gsoc/2026/summary.html)
+3. Create an organization file in `_gsocorgs/2026/` following the [README instructions](_gsocorgs/2026/README.md)
+4. Add your proposal to `_gsocproposals/2026/`
+
+> 📄 See [gsoc/guideline.md](gsoc/guideline.md) for full mentor guidelines.
+
+---
+
+## 📅 GSoC Timeline
+
+### 2026 Key Dates
+
+| Date | Event |
+|------|-------|
+| **Feb 19, 2026** | GSoC organization list announced |
+| **Feb 19 – Early Apr 2026** | Students interact with mentors, complete evaluation tests |
+| **Early April 2026** | Student proposal submission deadline |
+| **Late April 2026** | Slot requests submitted by org admins |
+| **Mid May 2026** | Accepted student projects announced |
+| **Mid May – June 2026** | Community bonding period |
+| **June – September 2026** | Coding period begins |
+
+> 🗓️ Always check the [official GSoC timeline](https://summerofcode.withgoogle.com/how-it-works/) for exact dates.
+
+---
+
+## 📚 Student Blogs & Publications
+
+ML4SCI students regularly publish about their work. Selected highlights:
+
+### Recent Blog Posts (2025)
+
+| Student | Project | Blog |
+|---------|---------|------|
+| Sijil Jose | Physics-Informed Neural Networks | [Read →](https://medium.com/@sijiljose.999/gsoc-2025-with-ml4sci-part-i-physics-informed-neural-network-for-diffusion-equation-pinnde-491d46a5b84d) |
+| Arnav Singhal | Q-MAML for Variational Quantum Algorithms | [Read →](https://medium.com/@arnavsinghal06/gsoc-25-q-maml-for-variational-quantum-algorithms-for-high-energy-physics-analysis-at-lhc-7b85a54a8924) |
+| Aaditya Porwal | Foundation Models for CMS | [Read →](https://medium.com/@aadityaporwal234/foundation-models-for-end-to-end-event-reconstruction-for-the-cms-experiment-08f2e1a45487) |
+| Hamees Sayed | Gravitational Lensing with Flow Matching | [Read →](https://medium.com/@hameessayed71/simulating-gravitational-lensing-with-flow-matching-gsoc-midterm-update-7d4692375ae8) |
+
+> 📖 View all student blog posts: [ml4sci.org/activities/studentblogs.html](https://ml4sci.org/activities/studentblogs.html)
+
+### Scientific Publications
+
+Research from ML4SCI has produced peer-reviewed publications including:
+
+- *"End-to-end jet classification of boosted top quarks with CMS open data"* - Physical Review D, 2022
+- *"SYMBA: Symbolic Computation of Squared Amplitudes with Machine Learning"* - ML: Science and Technology, 2022
+- *"Deep learning the morphology of dark matter substructure"* - The Astrophysical Journal, 2020
+- *"Locating Hidden Exoplanets in ALMA Data Using Machine Learning"* - The Astrophysical Journal, 2022
+
+> 📄 Full publication list: [ml4sci.org/activities/papers.html](https://ml4sci.org/activities/papers.html)
+
+---
+
+## 🗂️ Repository Structure
+
+This repository powers [ml4sci.org](https://ml4sci.org) using **Jekyll** and **GitHub Pages**.
+
+```
+ml4sci.github.io/
+│
+├── 📄 index.html  # Homepage
+├── ⚙️ _config.yml  # Jekyll site configuration
+├── 📦 Gemfile  # Ruby dependencies
+│
+├── 📁 _activities/  # GSoC & Hackathon activity pages
+│   ├── gsoc2021.md ... gsoc2026.md  # Per-year GSoC overview pages
+│   ├── hackathon2020.md  # ML4SCI Hackathon event pages
+│   ├── papers.md  # Student publications
+│   └── studentblogs.md  # GSoC student blog collection
+│
+├── 📁 _gsocproposals/  # GSoC project proposals (by year)
+│   ├── 2021/ ... 2026/  # Proposal markdown files
+│   └── archived/  # Retired proposals
+│
+├── 📁 _gsocprojects/  # Project descriptions (by year)
+│   ├── 2021/ ... 2026/  # project_[NAME].md files
+│   └── archived/  # Retired project descriptions
+│
+├── 📁 _gsocorgs/  # Participating organization profiles
+│   ├── 2020/ ... 2026/  # Organization markdown files
+│   └── [year]/README.md  # Instructions for adding orgs
+│
+├── 📁 gsoc/  # GSoC summary & mentor pages
+│   ├── guideline.md  # Mentor guidelines
+│   ├── students-guideline.md  # Student application guide
+│   └── [year]/
+│       ├── summary.md  # Full proposal index
+│       └── mentors.md  # Mentor listing
+│
+├── 📁 _layouts/  # Page layout templates
+├── 📁 _includes/  # Reusable HTML/Liquid components
+├── 📁 _profiles/  # Contributor profile cards
+├── 📁 _workinggroups/  # Working group pages
+├── 📁 _data/  # YAML data files
+│   └── training-schools.yml  # Training events data
+│
+├── 📁 images/  # Site images and logos
+├── 📁 css/  # Custom stylesheets
+├── 📁 organization/  # Org governance docs
+├── 📁 scripts/  # Utility scripts
+│   ├── add_training_event.py  # Helper for adding training events
+│   └── profile_maintenance_script.py
+│
+└── 📁 .github/workflows/  # CI/CD pipelines
+    ├── build.yaml  # Jekyll build check
+    ├── stale.yaml  # Stale issue management
+    └── test.yaml  # Link validation tests
+```
+
+### Key Content Files
+
+| File | Purpose |
+|------|---------|
+| `_gsocproposals/[year]/proposal_*.md` | Individual project proposals - **add new proposals here** |
+| `_gsocprojects/[year]/project_*.md` | Project overview descriptions |
+| `_gsocorgs/[year]/[org].md` | Organization profile - **add your institution here** |
+| `gsoc/[year]/summary.md` | Auto-generated ideas page index |
+| `gsoc/[year]/mentors.md` | Mentor listing |
+
+---
+
+## 🤝 Contributing to This Website
+
+We welcome contributions to improve the site, add new proposals, fix errors, or update information.
+
+### Prerequisites
+
+- [Ruby](https://www.ruby-lang.org/en/downloads/) and [RubyGems](https://rubygems.org/pages/download)
+- [Bundler](https://bundler.io/): `gem install bundler`
+- Git + a GitHub account
+
+### Local Development Setup
+
+```bash
+# 1. Fork and clone the repository
+git clone https://github.com//ml4sci.github.io.git
+cd ml4sci.github.io
+
+# 2. Install dependencies
+bundle install
+
+# 3. Run the local server
+bundle exec jekyll serve
+
+# 4. Visit http://localhost:4000 in your browser
+```
+
+### Adding a New GSoC Proposal
+
+```bash
+# 1. Create a new branch
+git checkout -b add-proposal-myproject
+
+# 2. Add your organization file (if new)
+cp _gsocorgs/2026/cern.md _gsocorgs/2026/myorg.md
+# Edit with your organization details
+
+# 3. Add a project description (if new)
+cp _gsocprojects/2025/project_CMS.md _gsocprojects/2026/project_MYPROJECT.md
+# Edit with your project details
+
+# 4. Add the proposal file
+cp _gsocproposals/2025/proposal_CMS1.md _gsocproposals/2026/proposal_MYPROJECT1.md
+# Edit with your full proposal content
+
+# 5. Commit and open a pull request
+git add .
+git commit -m "Add GSoC 2026 proposal: [Project Name]"
+git push origin add-proposal-myproject
+```
+
+### File Naming Conventions
+
+| Type | Convention | Example |
+|------|-----------|---------|
+| Organization | lowercase, no spaces | `university_of_alabama.md` |
+| Project | `project_UPPERCASE.md` | `project_DEEPLENSE.md` |
+| Proposal | `proposal_UPPERCASE#.md` | `proposal_DEEPLENSE1.md` |
+| Year directory | `YYYY/` | `2026/` |
+
+### Pull Request Guidelines
+
+- Keep PRs focused on a single change
+- Write a clear description of what was added/changed
+- Ensure the site builds locally before submitting
+- Check for broken links
+
+> 📖 New to Git/GitHub? See our [GitHub beginners guide](github-beginners.md).
+
+---
+
+## 📊 Program History
+
+ML4SCI has grown significantly since its first GSoC participation:
+
+```
+2020  ML4SCI Hackathon launched (University of Alabama)
+  │
+2021  First GSoC participation as umbrella org
+  │     ↳ ~20 students, 13 projects, 20+ institutions
+  │
+2022  GSoC expanded to 11 projects, added quantum ML track
+  │     ↳ Symbolic regression (SYMBA), quantum GANs introduced
+  │
+2023  Grew to 30+ mentors from 30+ institutions globally
+  │     ↳ First Goddard/NASA projects, GENIE anomaly detection added
+  │
+2024  40+ institutions, 14 projects, first MathWorks partnership
+  │     ↳ ML4DQM data quality monitoring added
+  │
+2025  15 projects, 40+ students, SYMMETRY & NeuroDyad introduced
+  │     ↳ Multiple students publish first-author papers
+  │
+2026  PREDICT project added, Kettering Health joins for medical ML
+```
+
+### Highlights
+
+- 🏅 **Multiple peer-reviewed publications** from GSoC student work
+- 🌍 **30+ participating institutions** across 4 continents
+- 🎓 **~150 students** trained in ML for science since 2021
+- ⚛️ **Active collaborations** with CERN, NASA, and national laboratories
+
+---
+
+## 👥 Administrators & Contact
+
+### GSoC Program Administrators
+
+| Administrator | Institution |
+|--------------|-------------|
+| [Prof. Sergei Gleyzer](https://sergeigleyzer.com/) | University of Alabama |
+| [Prof. Emanuele Usai](https://emanueleusai.com) | University of Alabama |
+| [Dr. Patrick Peplowski](https://civspace.jhuapl.edu/people/patrick-peplowski) | Johns Hopkins APL |
+
+### Contact Us
+
+| Purpose | Contact |
+|---------|---------|
+| 📧 **General GSoC inquiries** | [ml4-sci@cern.ch](mailto:ml4-sci@cern.ch) |
+| 🌐 **Website** | [ml4sci.org](https://ml4sci.org) |
+| 💻 **GitHub** | [github.com/ML4SCI](https://github.com/ML4SCI) |
+| 📅 **Events Calendar** | [Community Calendar](https://ml4sci.org/future-events.html) |
+
+---
+
+## 📜 License
+
+This website and its content are open source. The site is built with [Jekyll](https://jekyllrb.com/) and hosted on [GitHub Pages](https://pages.github.com/).
+
+Scientific code produced by GSoC students is released under open-source licenses as specified by each individual project.
+
+---
+
+
+
+**Made with ❤️ by the ML4SCI community**
+
+*Bridging machine learning and science, one project at a time.*
+
+[![GSoC 2026](https://img.shields.io/badge/GSoC-2026-4285F4?style=for-the-badge&logo=google)](https://summerofcode.withgoogle.com/)
+[![ML4SCI](https://img.shields.io/badge/ML4SCI-ml4sci.org-green?style=for-the-badge)](https://ml4sci.org)
+
+
\ No newline at end of file