This page aggregates project ideas for Google Summer of Code 2021. Each of the projects listed below has one or more mentors associated with it. Please contact the mentors for more information about these projects. If you have general questions about NCSA and GSoC, please contact:
This project will create a recommendation system that suggests similar galaxies to a user who is viewing galaxy images. The main idea is to use a Convolutional Neural Network (CNN) and/or autoencoders to rank galaxies from a large database by their similarity to a query image. Some work has already been done for this project, and the mentor will provide all the necessary images and tools. Ideally this will end up as a standalone web application or as an addition to DESaccess.
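The ranking step described above can be sketched as follows. This is a minimal illustration, not the project's existing code: it assumes each galaxy image has already been reduced to a non-zero feature vector (for example by a CNN or an autoencoder), and the galaxy names and toy vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, gallery, top_k=5):
    """Return the top_k (name, score) pairs, most similar to the query first."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in gallery.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, score) for score, name in scored[:top_k]]

# Hypothetical embeddings standing in for CNN/autoencoder features.
gallery = {
    "galaxy_a": [1.0, 0.0],
    "galaxy_b": [0.0, 1.0],
    "galaxy_c": [1.0, 1.0],
    "galaxy_d": [-1.0, 0.0],
}
ranked = rank_by_similarity([1.0, 0.1], gallery, top_k=2)
```

For a large database, a brute-force scan like this would probably be replaced by a vectorized or approximate nearest-neighbor search, but the ranking criterion stays the same.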
Requirements
front-end - Polymer, HTML, JavaScript, Python, Tornado
back-end - Python, TensorFlow, scikit-learn, image manipulation, machine learning
Deliverable
working plugin and recommendation system
Mentors
Links
NEAT is a next-generation sequence simulator widely used by the genomics community that currently runs only single-threaded. Speeding up this program through multiprocessing will broaden access and deliver speed gains.
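One common way to parallelize a simulator of this kind is to split the work into independent tasks and hand them to a process pool. The sketch below is not NEAT's code: the per-region split, the toy read generator, and all function names are assumptions made for illustration.

```python
import random
from multiprocessing import Pool

def simulate_region(task):
    """Generate random reads for one region (toy stand-in for a real simulator)."""
    region, n_reads, read_len, seed = task
    rng = random.Random(seed)  # per-task seed: output does not depend on scheduling
    reads = ["".join(rng.choice("ACGT") for _ in range(read_len))
             for _ in range(n_reads)]
    return region, reads

def simulate_all(regions, n_reads=100, read_len=50, base_seed=0, workers=1):
    """Simulate every region, serially or with a pool of worker processes."""
    tasks = [(region, n_reads, read_len, base_seed + i)
             for i, region in enumerate(regions)]
    if workers <= 1:
        results = [simulate_region(task) for task in tasks]
    else:
        with Pool(processes=workers) as pool:
            results = pool.map(simulate_region, tasks)
    return dict(results)
```

Because each task carries its own seed, `simulate_all(regions, workers=4)` should return the same reads as the serial run. When using more than one worker, call it from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.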
Requirements
Python
Deliverable
Multiprocessing support for the next-gen sequence simulator
Mentors
Links
The NEAT project continues to gain momentum with the recently released Python 3 version. By incorporating continuous integration, we can keep improving NEAT while maintaining its consistency.
Requirements
Knowledge of Python and of a CI platform such as Travis, Jenkins, or similar.
Deliverable
NEAT has continuous integration for future development
Mentors
Links
A key utility in processing genetic datasets in VCF format is the VCF_compare tool, a part of the Next-Gen Analysis Toolkit (NEAT). This tool, while useful, is outdated and needs to be updated so that it runs smoothly and efficiently on modern high-speed processors.
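At its core, comparing two VCF files means matching variant records between them. The sketch below is a simplified illustration and not the VCF_compare algorithm: it assumes tab-separated VCF data lines with the standard leading columns (CHROM, POS, ID, REF, ALT) and treats a variant as matched only when chromosome, position, reference allele, and alternate allele all agree.

```python
def load_variants(lines):
    """Collect (chrom, pos, ref, alt) keys from VCF data lines, skipping headers."""
    variants = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        for allele in alt.split(","):  # one key per alternate allele
            variants.add((chrom, pos, ref, allele))
    return variants

def compare(truth_lines, call_lines):
    """Split variants into matched, missed (truth only), and extra (calls only)."""
    truth = load_variants(truth_lines)
    calls = load_variants(call_lines)
    return {"matched": truth & calls, "missed": truth - calls, "extra": calls - truth}

# Hypothetical records for illustration only.
truth_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",
    "chr1\t200\t.\tC\tT,G\t.\tPASS\t.",
]
called_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT\t40\tPASS\t.",
    "chr1\t300\t.\tG\tA\t30\tPASS\t.",
]
result = compare(truth_vcf, called_vcf)
```

Set operations make each lookup constant time on average, which is one simple route to a shorter wall time. A production tool would also need to handle indel normalization and genotype comparison, which this sketch ignores.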
Requirements
Python
Deliverable
A functioning Python 3 VCF comparison tool with a shorter wall time than the original
Mentors
Links
Currently, the DataWolf scientific workflow system has a web frontend for creating and executing workflows that is written in Backbone.js. The goal is to begin modernizing the frontend with React, with a focus on setting up existing workflows for execution through the web interface.
Requirements
Knowledge of React
Deliverable
React web app that can communicate with a DataWolf server and execute existing workflows.
Mentors
Links
Currently, the CoverCrop Analyzer web service is written using Python Flask-RESTful. It needs to be improved by writing an OpenAPI 3.0 specification that includes authentication and by using a package such as Python Connexion for automatic validation. The main goals of this project are to generate API documentation that can be rendered using Swagger UI, to simplify the service code in the process, and to update the Dockerfile accordingly.
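As a rough idea of what such a specification contains, the sketch below builds a minimal OpenAPI 3.0 document in Python and serializes it to JSON. The endpoint path, the `operationId`, the parameter name, and the bearer-token scheme are all hypothetical and do not describe the actual CoverCrop Analyzer API. Connexion resolves each `operationId` to a Python function, which is where much of the simplification of the service code would come from.

```python
import json

# Hypothetical one-endpoint specification, for illustration only.
spec = {
    "openapi": "3.0.3",
    "info": {"title": "CoverCrop Analyzer (sketch)", "version": "0.1.0"},
    "paths": {
        "/results/{job_id}": {
            "get": {
                "operationId": "covercrop.api.get_result",  # assumed module path
                "parameters": [
                    {
                        "name": "job_id",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "string"},
                    }
                ],
                "security": [{"bearerAuth": []}],
                "responses": {
                    "200": {"description": "Result for the requested job"},
                    "401": {"description": "Missing or invalid credentials"},
                },
            }
        }
    },
    "components": {
        "securitySchemes": {"bearerAuth": {"type": "http", "scheme": "bearer"}}
    },
}

spec_json = json.dumps(spec, indent=2)
```

In practice the specification would more likely live in a YAML file that the service and Swagger UI both read; the dictionary form is used here only to keep the example self-contained.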
Requirements
Knowledge of Python and OpenAPI; Connexion (optional).
Deliverable
Improved CoverCrop Analyzer web service code that uses OpenAPI and Python Connexion, together with API documentation.
Links
One of the most significant challenges when using high performance computing (HPC) systems is that of customizing software to match heterogeneous computing environments. In this project, you will develop a user-focused tool for evaluating the configuration of the Parsl parallel programming library (parsl-project.org) for a specific target system. We will provide access to large-scale HPC clusters and work with the student to explore troublesome configuration scenarios.
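One very simple check such a tool might perform is confirming that the batch scheduler implied by a configuration is actually available on the target system. The sketch below illustrates that idea only and does not use Parsl itself: the mapping from Parsl provider class names to scheduler submit commands is an assumption, and `check_provider` is a hypothetical helper.

```python
import shutil

# Assumed mapping from Parsl provider class names to scheduler submit commands.
SUBMIT_COMMANDS = {
    "SlurmProvider": "sbatch",
    "PBSProProvider": "qsub",
    "TorqueProvider": "qsub",
    "LSFProvider": "bsub",
    "CondorProvider": "condor_submit",
}

def check_provider(provider_name, which=shutil.which):
    """Report whether the submit command for a provider is on the PATH."""
    command = SUBMIT_COMMANDS.get(provider_name)
    if command is None:
        return f"{provider_name}: no check defined"
    if which(command) is None:
        return f"{provider_name}: '{command}' not found on PATH"
    return f"{provider_name}: ok ('{command}' found)"
```

Calling `check_provider("SlurmProvider")` on a login node would report whether `sbatch` can be found. The `which` parameter exists so the lookup can be replaced in tests.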
Requirements
Python
Deliverable
A Python tool for evaluating Parsl configuration on an HPC cluster
Mentors
Links
In this project, we are looking for students to build out the capabilities of Clowder by simplifying the task of training deep learning models stored within the system, as well as executing existing models developed with TensorFlow, Keras, and PyTorch.
Requirements
Scala, Python, or Javascript
Deliverable
Modifications to Clowder’s core and supporting libraries (pyclowder) to simplify training and running deep learning models with TensorFlow, Keras, and PyTorch.
Mentors
Links
Extend the built-in geospatial capabilities of Clowder by adding support for geolocating datasets and extending the metadata search capabilities to geospatial queries.
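To give a sense of what a geospatial metadata query could look like with MongoDB, the sketch below builds a `$geoWithin` filter from a bounding box. It only constructs the query document and is not Clowder code: the `location` field name and the helper functions are hypothetical, and it assumes dataset locations are stored as GeoJSON geometries.

```python
def bounding_box_polygon(west, south, east, north):
    """GeoJSON polygon for a bounding box (longitude/latitude order, closed ring)."""
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # GeoJSON rings must end where they start
    ]
    return {"type": "Polygon", "coordinates": [ring]}

def geo_within_filter(field, west, south, east, north):
    """MongoDB filter matching documents whose geometry lies inside the box."""
    polygon = bounding_box_polygon(west, south, east, north)
    return {field: {"$geoWithin": {"$geometry": polygon}}}

# Hypothetical field name and an approximate box for illustration.
query = geo_within_filter("location", -88.4, 40.0, -88.1, 40.2)
```

A filter like this could be passed to a MongoDB driver's `find` call. A PostGIS backend would express the same idea in SQL with a spatial predicate instead.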
Requirements
Scala, Python, or Javascript
Deliverable
Modifications to Clowder’s core to support geospatial queries and visualizations using MongoDB or PostGIS databases.
Mentors
Links