Summer Research Projects
Prof. Constantine Sideris
1. Analog/RF Integrated Circuits for Biomedical and Wireless Applications: We are working on several projects related to wearable devices, ingestible devices (“smart pills”), and implantable devices for real-time health monitoring, disease detection and prevention. Interested students should have analog circuit design experience and preferably experience using the Cadence Virtuoso design tools. Experience using electromagnetic simulation tools such as Ansys HFSS or CST is also a plus.
2. Computational Electromagnetics: We are working on several projects related to automating the design of high-performance radio-frequency (RF) and nanophotonic devices, such as antennas, waveguide couplers, grating couplers, tapers, and nanophotonic switches. This involves developing our own solvers for Maxwell’s equations (e.g., integral equation methods, finite difference, and finite element methods) and coupling them with gradient-based topology optimization algorithms to inverse design such devices. We are also interested in exploring machine learning algorithms to accelerate the modeling of electromagnetic devices and improve the optimization of new devices. Interested students should have C/C++ programming experience and be comfortable with advanced electromagnetics concepts.
Prof. Muhao Chen:
1. Knowledge Acquisition with Indirect Supervision: Knowledge acquisition (e.g., relation extraction, entity and event typing) faces challenges including extreme label spaces, few-shot/zero-shot predictions and out-of-domain prediction. To this end, we study methods for leveraging indirect supervision signals from auxiliary tasks (e.g., natural language inference, text summarization, etc.) to foster robust and generalizable inference for knowledge acquisition. In the same context, we study methods for generating semantically rich label representations based on either gloss knowledge or structural knowledge from a well-populated lexical knowledge base, in order to better support learning with limited labels.
2. Event-Centric Natural Language Processing. Human languages evolve to communicate about events happening in the real world. Therefore, understanding events plays a critical role in natural language understanding (NLU). A key challenge to this mission lies in the fact that events are not just simple, standalone predicates. Rather, they are often described at different granularities, temporally form event processes, and are directed by specific central goals in a context. Our research in this line helps the machine understand events described in natural language. This includes the understanding of how events are connected and how they form processes or structural complexes, and the recognition of typical properties of events (e.g., space, time, salience, essentiality, implicitness, memberships, etc.).
3. Robust Information Extraction from Human Language Text.
Knowledge graphs (KGs) provide both open-world and domain-specific knowledge representations that are integral to many AI systems. However, constructing KGs is usually very costly and requires extensive effort. A widely attempted solution is to learn knowledge acquisition models that automatically induce structured knowledge from unstructured text. However, such models developed through data-driven machine learning are usually fragile to noise in learning resources, and may fall short of providing reliable inference on large, heterogeneous real-world data. We are developing a general meta-learning framework that seeks to systematically improve the robustness of learning and inference for data-driven knowledge acquisition models. We seek to solve several key problems to accomplish the goal: (i) how to identify incorrect training labels and prevent overfitting on noisy labels; (ii) how to detect invalid input instances in inference (e.g., out-of-distribution ones) and provide abstention-awareness; (iii) how to automatically learn constraints that strengthen model inference with global consistency; (iv) how to automatically augment training signals of the knowledge acquisition model or the backbone language model.
4. Machine Commonsense Reasoning with Minimal Supervision. Various types of commonsense inference tasks are challenging the SOTA language models. Such tasks may include inferring preconditions of facts, typical properties of entities and events (e.g. time, scales and numerical properties), and typical relations (e.g. ordering and membership of events, topological relations of entities). While annotating data for those aspects of commonsense inference can be costly, we seek to minimally leverage any expensive annotations, but instead develop linguistic pattern mining techniques to find vast cheap (though allowably noisy) supervision data from the Web, and lead that towards a scalable and generalizable solution to improve commonsense inference based on distant supervision.
Prof. Shaama Mallikarjun Sharada:
Area of Research: Machine learning methods for catalyst design (One student will be hosted)
Prof. Jose-Luis Ambite
Areas of Research: Secure Federated Learning in Biomedical Domains
There are situations where data relevant to machine learning problems are distributed across multiple locations that cannot share the data due to regulatory, competitiveness, or privacy reasons. Machine learning approaches that require data to be copied to a single location are hampered by the challenges of data sharing. Federated Learning (FL) is a promising approach to learn a joint model (a neural network) over all the available data across silos. We are investigating distributed training policies that improve convergence and performance, especially when the sites participating in a federation have different data distributions and computational capabilities.
We are interested in medical applications, such as a federation of hospitals or a distributed research study that seeks to predict a disease based on multimodal (imaging, EHR) data.
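As a rough illustration of how a joint model can be learned without copying data to one location, the sketch below shows sample-weighted averaging of per-site model weights (the widely used federated averaging idea). The description above does not name a specific aggregation rule, so this is an assumption for illustration only; the function name `federated_average`, the weight layout, and the two example sites are hypothetical.

```python
from typing import Dict, List, Tuple

# A model is represented here as a mapping from layer name to a flat list of weights.
Weights = Dict[str, List[float]]


def federated_average(site_updates: List[Tuple[Weights, int]]) -> Weights:
    """Average per-site weights, weighting each site by its number of samples.

    Only model weights and sample counts are exchanged; raw data stays at each site.
    """
    if not site_updates:
        raise ValueError("need at least one site update")
    total_samples = sum(n for _, n in site_updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    first_weights, _ = site_updates[0]
    averaged = {name: [0.0] * len(values) for name, values in first_weights.items()}
    for weights, n in site_updates:
        share = n / total_samples  # this site's contribution to the joint model
        for name, values in weights.items():
            for i, value in enumerate(values):
                averaged[name][i] += share * value
    return averaged


# Hypothetical example: two sites with different amounts of local data.
hospital_a = ({"layer1": [1.0, 2.0]}, 100)
hospital_b = ({"layer1": [3.0, 4.0]}, 300)
global_weights = federated_average([hospital_a, hospital_b])
print(global_weights)
```

Weighting by sample count is only one possible policy; sites with different data distributions or compute capabilities may call for other weighting or scheduling choices, which is the kind of question the project investigates.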
Dimitris Stripelis, José Luis Ambite, Pradeep Lam, Paul M. Thompson. Scaling Neuroscience Research using Federated Learning. 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), 2021.
Prof. Meida Chen
Our project aims to provide a large database of annotated ground truth point clouds reconstructed using aerial photogrammetry for training and validating 3D semantic and instance segmentation algorithms. We are also developing a synthetic data generation pipeline to create synthetic training data that can augment or even replace real-world training data.
Prof. Krishna Nayak
The work involves experiments and data collection, so virtual mode is not appropriate. Two students in-person in Summer 2022 are preferred.
1. Image Reconstruction for Real-time MRI
2. Improved Pediatric and Fetal MRI at 0.55 Tesla
Prof. Hossein Hashemi
1. Silicon Photonics Integrated Circuits: This project involves analysis, design, optimization, and simulation of various photonic devices such as waveguides, couplers, resonators, and gratings using a commercial foundry silicon photonics process for applications in lidar, imaging, display, free-space optical communications, optical signal processing, and biomedical sensors.
2. Millimeter-Wave Integrated Circuits: This topic involves analysis, design, optimization, and simulation of circuits that operate in the millimeter-wave frequency range (30 GHz – 300 GHz) for applications in wireless communications, radar, and imaging.
Prof. Andrei Irimia
My research is at the intersection of image analysis, machine learning, neuroscience, and gerontology. I am interested in predictive models of Alzheimer's disease (AD), cognitive impairment (CI) and accelerated aging, and in leveraging ML, neural networks, neurogenetics and MRI/CT to identify risk factors for AD and CI, as well as for other conditions leading to neurodegeneration, including traumatic brain injury (TBI). Combining such models based on MRI/CT with polygenic risk scores can help clinicians to prioritize monitoring and personalized strategies to reduce disease risk and delay disease onset.
Prof. Joshua Yang
Research Area: Post-CMOS materials and devices to enable non-von Neumann hardware, architecture and algorithms.
1. Neuromorphic / Synaptic computing using memristive devices with diffusion dynamics to implement neuroscience principles;
2. Hardware accelerators to efficiently implement Artificial Intelligence and Machine Learning using analog resistive switching devices;
3. High performance Non-volatile memories using emerging materials and devices.
Prof. Feifei Qian
Lab: Robot Locomotion And Navigation Dynamics (RoboLAND) Lab
Research Areas: Bio-inspired Robotics, Legged Locomotion, Robophysics.
1. Obstacle-aided locomotion and navigation: This project explores how robots can exploit different features of their physical environments to achieve desired movements. Can multi-legged robots and snake-like robots intelligently collide with obstacles on purpose to robustly move towards desired directions? Can a robot effectively turn itself by jamming the soft sand with its tail? In this project we will perform robot locomotion experiments to understand the complex interactions between robots and their environments, and use these interaction models to create novel strategies that can enable effective locomotion and navigation through challenging environments.
2. Understanding the world through every step: This project focuses on developing robots that can use their legs as soil or mud sensors to help geoscientists collect and interpret information at high spatial and temporal resolution. To achieve this, we will build robot legs that can sensitively “feel” the responses of desert sand or near-shore mud. We will design different interaction-based sensing protocols for the robot legs, and test these protocols in lab experiments. Once the sensing capabilities are developed and tested, we will take the robots on field trips, where the robots work alongside human scientists and learn how humans make sampling decisions and adapt exploration strategies based on dynamic incoming measurements. Going forward, these understandings will help equip our robots with cognitive “reasoning” capabilities to flexibly support human teammates’ scientific objectives during collaborative exploration missions.
Prof. Sven Koenig
The IDM artificial intelligence lab (idm-lab.org) is looking for students to help us develop the next generation of search algorithms for the kind of multi-agent path finding that is important for automated warehouses (www.youtube.com/watch?v=6KRjuuEVEZs) and railway scheduling (https://www.aicrowd.com/challenges/neurips-2020-flatland-challenge/).
In general, consider several agents (such as robots, trains, or game characters) that need to move from their current vertices on a graph with blocked and unblocked vertices to given goal vertices without obstructing each other's movements. This problem requires path planning but, different from single-agent path planning, is NP-hard and thus requires extremely smart algorithms to result in good performance. Our project is an artificial intelligence rather than a robotics project. It uses simple simulations instead of robot hardware. Please find out more about our research on this topic at http://idm-lab.org/project-p.html.
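To make "without obstructing each other's movements" concrete, here is a minimal sketch of how a candidate set of paths could be checked for collisions. It assumes one common formalization that the text above does not spell out: time advances in discrete steps, agents wait at their goal after arriving, and two agents conflict if they occupy the same vertex at the same time or swap vertices along the same edge. The names `position_at` and `first_conflict` are hypothetical.

```python
from typing import List, Optional, Tuple

Vertex = Tuple[int, int]   # a grid cell, used here as a graph vertex
Path = List[Vertex]        # path[t] is the agent's vertex at timestep t


def position_at(path: Path, t: int) -> Vertex:
    """Return the agent's vertex at time t; the agent waits at its goal afterwards."""
    return path[min(t, len(path) - 1)]


def first_conflict(paths: List[Path]) -> Optional[Tuple[str, int, int, int]]:
    """Return (kind, agent_i, agent_j, timestep) for the earliest conflict, or None.

    Assumes every path is non-empty.
    """
    if not paths:
        return None
    horizon = max(len(p) for p in paths)
    for t in range(horizon):
        for i in range(len(paths)):
            for j in range(i + 1, len(paths)):
                a_now, b_now = position_at(paths[i], t), position_at(paths[j], t)
                if a_now == b_now:
                    return ("vertex", i, j, t)
                if t + 1 < horizon:
                    a_next = position_at(paths[i], t + 1)
                    b_next = position_at(paths[j], t + 1)
                    # Two agents traverse the same edge in opposite directions.
                    if a_now == b_next and a_next == b_now:
                        return ("edge", i, j, t)
    return None


# Hypothetical example: two agents walking toward each other along a corridor.
head_on = [[(0, 0), (0, 1), (0, 2)], [(0, 2), (0, 1), (0, 0)]]
print(first_conflict(head_on))  # ('vertex', 0, 1, 1)
```

A checker like this only validates a solution. The hard part, and the focus of the project, is searching for conflict-free paths efficiently.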
Prof. Emilio Ferrara
Most of the world population is connected to the global information environment. Therefore, it is of paramount importance to understand how information spreads and to be able to capture, at scale and in real time, the mechanisms that govern the diffusion of online information in an increasingly interconnected world. Using machine learning models and causal inference methods, we will study online social phenomena at different temporal resolutions and at multiple scales, from individual to community to global collective behavior. Special attention will be given to dynamics of manipulation and the use of artificial intelligence agents like bots online.
Prof. Ananya Renuka Balakrishna
Research Area: Computational Materials Science, Solid Mechanics, Phase transformation
Outline: Functional materials are characterized by their native properties and functions of their own, such as ferroelectricity, magnetism, and light-interactive properties. These materials show a large response to small stimuli and are used in sensors, actuators, memory devices, and energy harvesters. In this project, students will use previously developed multi-physics models from our group to computationally engineer material microstructures that can drastically enhance material performance.
Prof. Justin Haldar
Magnetic resonance (MR) imaging technologies provide unique capabilities to probe the mysteries of biological systems, and have enabled novel insights into anatomy, metabolism, and physiology in both health and disease. However, while MR imaging is decades old, is associated with multiple Nobel prizes (in physics, chemistry, and medicine), and has already revolutionized fields like medicine and neuroscience, current methods are still very far from achieving the full potential of the MR signal. Specifically, modern MR imaging methods suffer from long data acquisition times, limited signal-to-noise ratio, and various other practical and experimental factors. This limits the amount of information we can extract from living human subjects, and often precludes the use of advanced experimental methods that could otherwise increase our understanding by orders of magnitude. Our research group addresses such limitations from a signal processing perspective, developing novel methods for data acquisition, image reconstruction, and parameter estimation that combine: (1) the modeling and manipulation of physical imaging processes; (2) the use of novel constrained signal and image models; (3) novel theory to characterize signal estimation frameworks; and (4) fast computational algorithms and hardware. Our methods are often based on jointly designing data acquisition and image reconstruction methods to exploit the inherent structure that can be found within high-dimensional data, and we do our best to take full advantage of the “blessings of dimensionality” while mitigating the associated “curses.” We are seeking excellent students with a strong background in signal processing, with an interest in developing methods to improve existing advanced MR methods and an interest in enabling/exploring innovative next generation imaging approaches.
Prof. Viktor Prasanna
Areas of Research:
1. FPGA acceleration
2. ML and AI acceleration
3. Streaming data science
4. Parallel Computation
Prof. Jonathan Gratch
Areas of Research: Affective Computing, Natural Language Processing, Psychology
Automated negotiation systems in natural language: The Affective Computing Group at USC’s Institute for Creative Technologies (ICT) is seeking an intern to assist our research in automated negotiation systems that exhibit appropriate emotional and strategic awareness, while utilizing realistic modes of communication such as natural language. Interns will be involved in designing a practically inspired dialogue system that engages in effective negotiations with humans. This is a highly interdisciplinary project. Such negotiation systems find a number of practical applications from pedagogy to conversational AI. The precise problem can be decided by incorporating the interests of the interns, but potential problems include: 1) manipulating the emotional or strategic behaviors of the negotiation system and conducting crowdsourcing experiments to understand how these manipulations impact negotiation outcomes, 2) using insights from prior affective computing or psychological studies to build neural text generation models for negotiation dialogue systems, 3) improving response quality in terms of consistency and coherence, 4) developing opponent modelling approaches based on chat-based conversations.
Our summer interns frequently publish their work at a wide variety of venues. Here are a couple of successful intern projects published at NAACL 2021 (https://aclanthology.org/2021.naacl-main.254.pdf) and ACII 2021 (https://arxiv.org/pdf/2107.13165.pdf). We expect future work to build upon this progress.
Recommended Skills: Strong coding skills (Python preferred) and prior experience/interest in ML, NLP, Affective Computing techniques.
Prof. Mengjie Yu
1. Nanofabrication of integrated photonic circuits based on silicon, silicon nitride and lithium niobate
2. Photonic structure design and optimization for frequency conversion
3. Visible to telecommunication conversion for quantum interconnection
4. Characterization of high quality microring resonators
Areas of Research: nanophotonics, nonlinear optics, optoelectronics, optical frequency combs
Prof. Jesse Thomason
Projects: Our lab works on Grounding Language in Actions, Multimodal Observations, and Robots (GLAMOR). We are working towards robots and virtual AI agents that can understand language requests like "Help me make breakfast" and ask questions when they are confused, such as "Where can I find the salt?" Our projects span from corpus-based machine learning, to learning AI agents in simulation environments, to learning and executing language instructions on physical robot platforms.
Areas of Research: Language Grounding, Language and Robotics (RoboNLP), Embodied AI
Prof. Chia Wei (Wade) Hsu
Areas of Research: Photonics in complex systems
1. Imaging in scattering media
2. Computational electromagnetics
3. Metasurface flat lens
4. Non-hermitian photonics
5. Wave propagation in disordered media and multi-mode fibers
Prof. Michelle Povinelli
1. Computation and logic using the flow of light and heat
2. Encrypted communications using light
Prof. Zhaoyang Fan
The research area of my lab is the development and applications of novel magnetic resonance imaging techniques in human diseases.
Projects available for undergraduate students:
1) Deep-learning driven imaging acceleration for cardiovascular MR. We have over 70 cases with intracranial vessel wall images that were acquired in 12 minutes. With physics-informed neural networks, we wish to accelerate the acquisition from 12 minutes to 4 minutes per scan, and reduce the whole intracranial vessel wall imaging protocol from 18 minutes to 8 minutes or shorter.
2) Brain cancer MR imaging with spatial resolution enhancement. We have a 2-in-1 MR technique that provides perfusion and permeability quantification with low and high resolution. We would like to develop a self-learning technique to improve spatial resolution for better lesion characterization.
Prof. Dominique Duncan
1. EpiBioS4Rx: The Epilepsy Bioinformatics Study for Antiepileptogenic Therapy (EpiBioS4Rx) is a large, international, multicenter Center without Walls (CWOW) to address this pressing need by using studies of animals and patients with traumatic brain injury (TBI) leading to post-traumatic epilepsy (PTE) to develop the techniques and patient populations necessary to carry out future cost-effective full-scale clinical trials of epilepsy prevention therapies. Various analysis methods are applied to EEG, MRI (structural, functional, and diffusion), and blood data from both rodents and humans.
2. DABI: The Data Archive for the BRAIN Initiative is a shared repository for invasive neurophysiology data from the NIH Brain Research Through Advancing Innovative Neurotechnologies (BRAIN) Initiative, including machine learning and other analytic tools applied to electrophysiology data.
3. COVID-ARC: The COVID-19 Data Archive stores multimodal (i.e., demographic information, clinical-outcome reports, imaging scans) and longitudinal data related to COVID-19 and provides various statistical and analytic tools for researchers as well as machine learning applied to chest CT imaging data.
Prof. Pedro Szekely
1. Understanding event processes: Natural language always communicates about events, and events often connect into processes due to some central goal. Given the event process "fulfilling course requirements" -> "passing qualification exams" -> "publish papers" -> "doing internships" -> "defend dissertation", does a machine understand that it leads to the central goal of "earning a degree"? And how do we efficiently teach the machine to understand the salience of events, i.e. that "defending dissertation" is much more important than "doing internships"? Does such knowledge help downstream narrative tasks (e.g. summarization, dialogue generation, story completion)?
2. Summarizing data in complex tables: Web tables contain rich information that may be displayed in massive and complex structures. To synthesize actionable knowledge from tables that would help downstream NLU tasks (e.g. QA and fact verification), a system needs to have the ability to summarize the salient information of tables into natural language claims. However, this is accompanied by several key challenges: (1) How to find salient information in tables of any layout structure? (2) Since different subparts of a table represent different knowledge/facts, how to foster controlled natural language generation based on various parts of the same table? (3) How do we enable effective aggregation of information in the generated summaries (e.g. finding the max, averages, or identifying specific patterns)? (4) How do we summarize in a way that helps downstream tasks for question answering and fact verification?
3. Distant supervision for commonsense inference: various types of commonsense inference tasks are challenging the SOTA language models. Such tasks may include inferring preconditions of facts, typical properties of entities and events (e.g. time, scales and numerical properties), and typical relations (e.g. ordering and membership of events, topological relations of entities). While annotating data for those aspects of commonsense inference can be costly, we seek to minimally leverage any expensive annotations, but instead develop linguistic pattern mining techniques to find vast cheap (though allowably noisy) supervision data from the Web, and lead that towards a scalable and generalizable solution to improve commonsense inference based on distant supervision.
4. Semantically rich label representations for open-domain information extraction: (open-domain) information extraction tasks (e.g., relation extraction and entity typing) easily suffer from problems including extreme label spaces, few-shot/zero-shot predictions and out-of-domain prediction. To this end, we plan to study methods for generating semantically rich label representations based on either gloss knowledge or structural knowledge from a well-populated lexical knowledge base, and further leverage SOTA natural language inference models to foster more robust and generalizable inference for information extraction in an open domain.
Prof. Aleksandra Korolova
1. Differential Privacy - from Theory to Practice: Differential privacy has emerged as the most promising approach to privacy-preserving data collection and analysis. Although recently differentially-private algorithms have been deployed by Google (https://github.com/google/rappor) and Apple, those deployments are limited in the kinds of use cases they can address.
In this project, we will address several of the barriers to making differentially private algorithms universally useful. We will develop new algorithms for differentially-private on-device machine learning while making modeling assumptions appropriate for medium-sized companies collecting data. The project will consist of algorithm design and analysis, prototype implementation, and experiments measuring performance under various assumptions. We'll aim to have the project's findings published in a top-tier privacy or machine learning venue and adopted in practice by companies interested in providing strong privacy guarantees.
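For readers new to the area, the sketch below shows randomized response, a classic mechanism that provides differential privacy at the level of each individual report and is the basic idea that RAPPOR builds on. It is offered only as background illustration, not as the algorithms this project will develop; the function names and the simulated population are hypothetical.

```python
import math
import random
from typing import List


def keep_probability(epsilon: float) -> float:
    """Probability of reporting the true bit under epsilon-randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(true_bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability e^eps / (1 + e^eps), else the flipped bit."""
    if rng.random() < keep_probability(epsilon):
        return true_bit
    return 1 - true_bit


def estimate_fraction(reports: List[int], epsilon: float) -> float:
    """Debias the noisy reports to estimate the true fraction of 1s (needs epsilon > 0).

    If pi is the true fraction and p the keep probability, the expected report
    mean is pi * (2p - 1) + (1 - p); this function inverts that relation.
    """
    p = keep_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical simulation: 20,000 users, 30% of whom hold the sensitive bit 1.
rng = random.Random(0)
true_bits = [1 if i < 6000 else 0 for i in range(20000)]
reports = [randomized_response(b, epsilon=1.0, rng=rng) for b in true_bits]
print(round(estimate_fraction(reports, epsilon=1.0), 3))
```

Smaller values of epsilon give each user stronger deniability but make the aggregate estimate noisier, which is one face of the utility barriers the project aims to address.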
2. Ensuring Fairness in Machine Learning: Data-driven algorithms and machine learning are increasingly used in systems that make decisions about and on behalf of people. As these algorithms become more common and more complex, it is crucial to understand their inherent risks, such as codifying and entrenching biases, reducing accountability, and creating new types of discrimination. In this project, we will take a principled approach to auditing the algorithms or machine learning models used by a major online service provider with an eye to identifying unexpected risks. We will quantify the identified risks and factors influencing them, and research algorithmic modifications and transparency options that could help remedy them.
Prof. Yan Liu
Machine learning methods have had great success over the last decade in learning complex representations of data that enable novel modeling and data processing approaches in many scientific disciplines. However, research has also shown that the purely data-driven deep networks are often brittle to distributional shifts in data: it has been shown that human-imperceptible changes can lead to absurd predictions. In many application areas, including physics, this motivates the need for robustness and interpretability, so that models can be trusted in practical applications. As a potential solution, incorporating domain knowledge within the model or learning process as an inductive bias has been actively studied for robustness and interpretability. Furthermore, domain knowledge informed data-driven models can be also beneficial for scientific discovery such as causal reasoning and probabilistic inference. In this project, we are interested in how physics-based knowledge can be used for efficient learning, robustness, scientific discovery, and interpretability of deep networks.
Prof. Meisam Razaviyayn
1. Designing quantized neural networks. In this work, we explore optimization algorithms that can systematically decide on quantization level and train neural networks with quantized values.
2. Defense against adversarial attacks in neural networks. In this work, we developed algorithms with theoretical performance guarantees for defense against adversarial attacks in neural networks. For more details, see https://arxiv.org/pdf/1902.08297.pdf
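To illustrate what training with "quantized values" refers to in the first project, here is a minimal sketch of uniform quantization, where each weight is snapped to one of 2^b evenly spaced levels. The project text does not specify a quantization scheme, so the uniform scheme and the name `quantize_uniform` are assumptions for illustration; choosing the level per layer and training under that constraint are the actual research questions.

```python
from typing import List


def quantize_uniform(weights: List[float], num_bits: int) -> List[float]:
    """Snap each weight to the nearest of 2**num_bits evenly spaced levels in [min, max]."""
    if num_bits < 1:
        raise ValueError("num_bits must be at least 1")
    if not weights:
        return []
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)  # a constant tensor is already representable
    intervals = 2 ** num_bits - 1          # number of gaps between adjacent levels
    step = (hi - lo) / intervals
    return [lo + round((w - lo) / step) * step for w in weights]


# Hypothetical example: 2 bits gives four levels between -1.0 and 1.0.
weights = [-1.0, -0.2, 0.3, 1.0]
quantized = quantize_uniform(weights, num_bits=2)
print([round(q, 4) for q in quantized])
```

Fewer bits reduce memory and compute cost but increase the rounding error introduced into each weight.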
Prof. Srivatsan Ravi
Research Agenda: https://cpb-us-e1.wpmucdn.com/sites.usc.edu/dist/5/239/files/2019/05/research-statement.pdf
1. Concurrency challenges in Cryptocurrencies: A privacy and security perspective on building next generation Smart Contract ecosystems - Smart contracts and cryptocurrencies promise the reinvention of the monetary circuit by applying tamper-resistant computer system concepts to the financial sector, thus introducing efficiency and communal monitoring of real time transactions. While the span of potential future applications is promising, open research problems remain to be addressed in order to transition to real-world applications, the most crucial being the end-to-end security of transactions. Our research will focus on addressing security, privacy and scalability challenges that are inherent to emerging smart contract systems. We study existing ecosystems like Bitcoin, Ripple and Ethereum. We identify a holistic approach to hardening the design and implementation of smart-contracts against the ever-increasing attack surface of rapidly evolving hardware and software trends.
2. Distributed Network Algorithmics - The ever increasing availability of IoT devices and applications necessitates communication protocols that scale commensurately. The management and control of these interconnecting networks is heavily strained as billions of transitive IoT devices are deployed and new applications are dynamically introduced into enterprise networks. Projects on distributed network algorithmics address algorithmic protocols that are at the heart of the modern networking stack from the lens of distributed computing theory, systems and verification. These projects concern: (i) resilient and robust protocols for enforcing consistent data plane updates via distributed network controllers; (ii) algorithms for smart, secure and scalable edge-computing infrastructure; (iii) distributed estimation of network topology and provenance of network state; and (iv) architectures for high-fidelity and programmable network experimentation infrastructure.
3. Concurrent data structures design, verification and analysis - To meet modern computational demands and to overcome the fundamental limitations of computing hardware, the traditional single-CPU architecture is being replaced by a concurrent system based on multi-cores or even many-cores. Therefore, at least until the next technological revolution, the only way to respond to the growing computing demand is to invest in smarter concurrent algorithms. Synchronization, one of the principal challenges in concurrent programming, consists in arbitrating concurrent accesses to shared data structures: lists, hash tables, trees, etc. Intuitively, an efficient data structure must be highly concurrent: it should allow multiple processes to “make progress” on it in parallel. This project delves into designing lower and upper bounds for such concurrent data structures and formally verifying the correctness of the corresponding algorithms.
Prof. Mukund Raghothaman
Areas of Research: The focus of my research lab is on programming technology: how do we help programmers write better code, automatically find bugs and security vulnerabilities, and help programmers make sense of large, complex codebases? We are applying cutting-edge ideas from machine learning and artificial intelligence to these and other fundamental challenges in programming languages.
1. Can we use reinforcement learning systems to find bugs in programs, at a fraction of the computing cost and human effort expended by the current generation of tools?
2. Can we automatically synthesize pieces of code using ideas such as gradient descent? In both cases, the fundamental challenge is in finding good program representations. Potential solutions would have outsize impact, by enabling programmers to offload tedious portions of their code to constraint solvers, and by enabling programming systems to comprehend the code that they are processing.
The projects will be mathematically rigorous and involve hands-on programming. We will be reading papers, brainstorming ideas, and presenting our solutions in writing. Therefore, the ideal candidate will be comfortable designing and reasoning about algorithms --- both formally and heuristically --- and also confident in designing moderately large codebases, and hacking into existing software projects. Prior experience in programming languages, verification, or static analysis is not necessary: we will review prerequisite ideas as they arise.
In return, the candidate can expect to gain experience in applying powerful ideas from artificial intelligence and constraint solving, and gain a deeper understanding of programming technology --- compilers, debuggers, fuzz testers, and static analyzers --- that form the toolkit of professional software engineers.
Prof. Daniel Garijo
Areas of Research: eScience, Scientific Software, Knowledge Representation and Semantic Web.
1. Enhancing Scientific Software Knowledge Graphs with existing knowledge on the Web - Knowledge Graphs are becoming an increasingly important mechanism to represent data in a machine-readable way, ease querying and facilitate linking resources in a distributed environment. Wikidata, a crowdsourced knowledge graph derived from Wikipedia, is one of the biggest open knowledge graphs available to date, with thousands of curators and millions of available entities. In this project the student will work closely with knowledge engineers at ISI to leverage the contents of Wikidata to improve an existing knowledge graph of scientific software.
The student will learn the following technologies: using REST APIs, RDF, SPARQL.
Recommended prior knowledge: Python or Java, basic knowledge representation.
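As a taste of the technologies listed for this project, the sketch below builds a SPARQL query for Wikidata items that are software and the URL that would send it to the public query endpoint. It is only an assumed starting point, not the project's actual queries: the function names are hypothetical, and the identifiers used (`P31` "instance of", `P279` "subclass of", `Q7397` "software") should be double-checked against Wikidata before use. No network request is made here.

```python
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_software_query(limit: int = 10) -> str:
    """Build a SPARQL query listing items that are instances of software (or a subclass)."""
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        "  ?item wdt:P31/wdt:P279* wd:Q7397 .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as a GET request URL that asks for JSON results."""
    return WIKIDATA_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = build_software_query(limit=5)
url = build_request_url(query)
print(query)
```

The resulting URL can be fetched with any HTTP client, and the returned bindings could then be matched against entries in the existing scientific software knowledge graph.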
2. Towards a semantic representation of Docker images - Software usability is a critical issue for adopting and building on existing scientific methods. Docker is a popular virtualization environment that helps developers set up and reuse existing software. Developers reuse the software images created by other developers via "Dockerfile" specifications, which are often available on the web in repositories such as DockerHub (https://hub.docker.com/). However, searching over these images is difficult, and as a result, researchers often explore their contents manually. In this project the student will apply automated extractors to generate a machine-readable metadata representation of software images, so that they can be represented as a knowledge graph and queried efficiently.
The student will learn the following technologies: RDF, ontologies, SPARQL, Docker.
Recommended prior knowledge: Python or Java, basic knowledge representation.
3. Automated metadata extraction from scientific software documentation - Scientific software is critical for understanding and reproducing scientific results. But executing software made by others requires scientists to spend a lot of time reading over documentation and setup instructions. In previous work, we developed an approach to classify the important parts of documentation automatically. In this project the student will leverage our previous findings by applying new classifiers and building a user interface to help users classify new software documentation.
Recommended prior knowledge: Web technologies, Python or Java, basic knowledge representation.
4. A semantic search engine for scientific software - Despite the recent advances in open science, searching for scientific software results is still a complex issue that requires manual work. However, little by little, developers are increasingly exposing scientific software metadata on the Web. In this project the student will build a crawler to detect the descriptions created by developers to collect them in a metadata index that can be queried and browsed by anyone.
The student will learn the following technologies: using REST APIs, RDF, SPARQL.
Recommended prior knowledge: Python or Java, basic knowledge representation.
Prof. Peter Beerel
1. Software/CAD in the areas of low-power design, latch-based design, and asynchronous design. Students will learn about state-of-the-art techniques to attack power consumption in these domains and contribute to computer-aided design tools that reduce it. Students should have experience with commercial ASIC flows.
2. Software/HW design in the area of superconducting electronics. With the end of Moore’s law, the search for the next-generation technology is extremely important. Superconducting electronics, which run at ultra-low temperatures (4 kelvin!), is one area we are looking at, and new software tools and circuits are needed to increase the scale of possible designs. Strong programming experience along with knowledge of open-source and/or commercial computer-aided design tools is desired.
3. Machine learning acceleration. Students with a strong machine learning background and ASIC/FPGA design experience will be considered to help our team build efficient machine learning accelerators for DNN/CNN/RNN/LSTMs.
Prof. Quan T. Nguyen
1. Control, optimization and machine learning for achieving extremely robust locomotion on quadruped robots
2. Design and control of a hybrid wheel-leg robot, toward the future of delivery robots
3. Collision avoidance in emergency cases for self-driving cars
Prof. Jonathan May
Area of Research: Natural Language Processing
1. Cross-Lingual Information Retrieval, Information Extraction, and Machine Translation in low-resource languages - We use transfer learning, language universals, and related-language information, plus new, less data-hungry models, to build state-of-the-art NLP systems for languages with little to no supervised data, going beyond the few over-studied languages.
2. Creative dialogue generation over email to combat phishing attacks - We try to stop phishing by intercepting attempts using an agent that pretends to be a phisher’s intended victim.
Prof. Kristina Lerman
1. Scientific collaboration network analysis - Scientific collaborations underpin most of scientific progress and innovation. In the last few years, bibliographic data sets have become available that allow us to study how these collaborations form and evolve. No prior experience with network analysis is required. Experience with Python and scientific computing (Matlab, scikit-learn) is preferred.
2. The data science of poverty - Understanding the macroeconomic underpinnings of poverty is complicated by interactions between various explanatory variables. We are developing machine learning methods to unbias variables to remove the impact of potential confounders (e.g., race or education). Such fair analysis promises to better explain relationships between different demographic factors and poverty-related outcomes. Interns should be proficient in basic statistics and Python. We will download and analyze a variety of data sets related to poverty. This is an opportunity to learn how to analyze heterogeneous big data.
Prof. Mayank Kejriwal
1. Social Media Sensing: Social media like Twitter is a valuable source of crowdsourced data that can help us understand situations on the ground well before a formal organization (such as a news broadcaster) disseminates information on them. During, and in the aftermath of, natural disasters such as the Kerala floods in India or the 2015 earthquake in Nepal, developing accurate, semi-automatic techniques for sensing 'signals' from social media streams, including urgent needs, is an important problem whose solution can help avert tragedies in a timely fashion. In this project, we look to define, model and build such social media sensors on existing disaster Twitter datasets. Techniques must be adaptable (to multiple disasters) and robust (able to deal with irrelevant data), while requiring little to no training data.
2. Devising and Evaluating Large-scale Twitter Embeddings: Word embeddings (like GloVe and fastText) and graph embeddings (node2vec, DeepWalk) have become very popular paradigms in machine learning for unsupervised representation learning over myriad datasets. In the real world, data is often not words, documents or graphs, but heterogeneous data that is a mix of all three. Embedding such datasets, of which Twitter is a good example, is not only challenging from an algorithmic standpoint, but also difficult to evaluate precisely. This project seeks to learn multiple kinds of embeddings over heterogeneous, large-scale Twitter corpora, and to devise appropriate means of evaluating these embeddings. The research agenda draws on both engineering and scientific principles.
Prof. Jayakanth Ravichandran
Areas of Research: Electronic and photonic materials and devices for emerging applications
1. Negative capacitance for emerging electronics: This project will involve the design, fabrication and device characterization of heterostructure gate dielectrics showing the negative capacitance effect for use in high mobility field effect transistors for logic and memory applications. There can be a theory component for the project too.
2. Phase change oscillator for neuromorphic applications: This project will leverage metal-to-insulator phase change to achieve high frequency electrical oscillators with small footprint and low power. This device will enable highly scaled neuromorphic circuits. Both theory and experimental studies will be carried out.
3. Development of giant linear and nonlinear photonic materials: This project will develop new linear and nonlinear photonic materials with giant susceptibilities. We have already demonstrated a world-record birefringence in a quasi-1D chalcogenide (BaTiS3). This experimental work will continue and build on this effort.
Prof. Joseph Lim
Areas of Research: Reinforcement learning, robotics, deep learning, meta learning, imitation learning, and hierarchical reinforcement learning.
Prof. Bhaskar Krishnamachari
Group Description: The Autonomous Networks Research Group, directed by Prof. Bhaskar Krishnamachari, seeks bright undergraduate students with backgrounds in electrical engineering, computer science and mathematics for research into the Internet of Things, wireless robotic networks, connected and autonomous vehicles, and other next generation wireless networks. Projects will involve a mix of mathematics, simulation and testbed experiments, tailored to student background and interest, and provide a strong experience in graduate-level research. Previous summer interns with this group have gone on to Ph.D. programs at top places including UC Berkeley, Columbia, MIT, Princeton, UCLA, USC, UT Austin, Stanford, U. Michigan, UIUC.
Prof. Michelle Povinelli
Areas of Research: Study of microphotonic devices for alternative approaches to memory, logic, and computation at high temperatures.
Projects: Electromagnetic simulation of microfabricated devices and experimental device testing.
Prof. Haipeng Luo
Areas of Research: Bandit optimization, reinforcement learning, and online learning/optimization.
Prof. Richard Leahy
Project Areas: Brain imaging, signal processing, and machine learning. The human brain processes rich and complex information in a multifaceted world. This project is focused on developing and validating a mathematical and statistical framework based on the BrainSync transform to address the challenges inherent in analyzing multisubject brain imaging data during resting, naturalistic stimulation and self-paced activity paradigms. The project will involve development of novel mathematical and machine learning algorithms, programming, and experimentation on human brain imaging data.
Prof. Xiang Ren
Areas of Research: Natural Language Processing, Machine Learning, Data Mining
Prior-informed, Label-Efficient Deep Learning for NLP:
In the Intelligent and Knowledge Discovery (INK) Lab at USC, led by Prof. Xiang Ren, we are interested in developing deep learning methods that can consume a small amount of human-annotated data and can effectively encode human prior/domain knowledge in the form of symbolic structures, with applications to natural language understanding tasks such as question answering, natural language inference, and information extraction. In particular, we are excited about machine learning problems in the space of modeling sequential/graph-structured data with weak supervision and prior knowledge. This includes neural-symbolic learning, learning with noisy data, zero/few-shot learning, and transfer learning.
Prof. Mike Shuo-Wei Chen
1. High performance, low power analog circuit design
2. Machine learning assisted circuit design
3. Bio-inspired computing hardware
Prof. Assad A Oberai
Projects: PDE-based models for unsupervised learning - Recent developments have established a close connection between the big (infinite) data limit of graph-based unsupervised learning algorithms and partial differential operators. This leads one to consider whether computational methods used for solving PDEs can play a role in constructing new, more robust and efficient algorithms for solving unsupervised learning problems. Over the period of a summer, the student will explore this connection and will help develop and implement these PDE-based algorithms for solving unsupervised and semi-supervised learning problems.
Prof. Shinyi Wu
Projects: Our project is Markov computational modeling and simulation for affective disorders. This is a multiyear research agenda that includes a theory development part, a computational/simulation part, and an empirical data collection part. In particular, we would like students who are interested in training in building Markov models for computational psychiatry.
Prof. Cyrus Shahabi
1. Empirical Evaluation of Geo-Indistinguishability Mechanisms - Mobile users interact very frequently with location-based apps such as maps, ride-sharing services, geosocial networks, etc. The disclosure of unprotected location data can lead to serious privacy breaches related to one’s health or financial status, political or religious orientation, etc. Geo-Indistinguishability (GeoInd) is emerging as a promising model for protecting location privacy, but existing techniques that implement it are either too slow to be used in mobile apps, or introduce too much data distortion, decreasing apps’ usability. In this project, students will implement and experimentally evaluate the performance and accuracy of existing techniques for GeoInd on a broad set of real-life datasets, and under a diverse set of parameter settings. The objective of the project is to identify which technique is suitable for which specific use-case scenario, and how one should set system parameters to achieve desired system goals. Required Skills: strong coding skills (C/C++, Java), good mathematical background in computational geometry and statistics.
2. Performance Evaluation of Searchable Encryption Techniques - The emergence of cloud computing led to a trend where data are stored and processed at entities that may not always be trusted. Since large amounts of data about individuals are outsourced to the cloud, serious concerns arise regarding privacy. Even in cases where the cloud service provider is not malicious, a security breach by a malicious adversary can lead to the disclosure of private individual data. To address this threat, a number of encryption techniques have been proposed which allow both secure storage of data and query processing directly on the ciphertexts. Some basic operations such as exact match, range queries, or evaluation of inner-products are currently supported by such cryptographic primitives, in either the symmetric or asymmetric cryptography setting. However, all these techniques have a significant performance overhead, which calls their practical applicability into question. In this project, students will implement several prominent techniques for searchable encryption, and thoroughly evaluate their performance for a broad set of datasets and a diverse set of parameter settings. The objective is to understand the performance overhead of searchable encryption, and attempt to optimize the performance under certain use case scenarios (e.g., by identifying certain parameter settings that reduce overhead, or by employing parallelism where possible). Required Skills: strong programming skills (Python, C, C++, Java), good background in number theory.
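To give a flavor of what a GeoInd technique in project 1 might look like, the sketch below perturbs a location with planar Laplace noise, one well-known mechanism from the geo-indistinguishability literature. It is an illustrative assumption, not necessarily one of the techniques the project will evaluate. The function names, the parameter epsilon (privacy level per unit distance), and the use of flat planar coordinates in meters are all hypothetical choices made for this example.

```python
import math
import random


def planar_laplace_noise(epsilon, rng=random):
    """Draw a 2-D offset (dx, dy) from a planar Laplace distribution.

    The offset direction is uniform, and the radius follows the density
    epsilon^2 * r * exp(-epsilon * r), i.e. a Gamma distribution with
    shape 2 and scale 1/epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    theta = rng.uniform(0.0, 2.0 * math.pi)    # uniform direction
    r = rng.gammavariate(2.0, 1.0 / epsilon)   # radial distance
    return r * math.cos(theta), r * math.sin(theta)


def perturb(x, y, epsilon, rng=random):
    """Return the reported location for a true planar location (x, y)."""
    dx, dy = planar_laplace_noise(epsilon, rng)
    return x + dx, y + dy
```

Under these assumptions the expected displacement is 2/epsilon, so a smaller epsilon means stronger privacy but more distortion, which is the trade-off between privacy and usability that the project description refers to.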
Prof. Sandeep Gupta & Prof. Pierluigi Nuzzo
1. Developing custom computing solutions for inverse problems: Inverse problems span many application domains, including combinatorial optimization and bio-medical imaging. Professors Sandeep Gupta and Pierluigi Nuzzo are working on two different approaches for building new custom computing solutions for SAT (satisfiability), the key inverse problem in the logic domain to which a wide range of important applications can be reduced. Specifically, their current approaches go beyond the common paradigms for hardware acceleration (namely, lower hardware delays, harnessing higher core-/logic-level parallelism, and reducing data movement) and embody new paradigms, especially harnessing electrical-level parallelism (e.g., achieving n-way broadcasts over interconnects and into memories at O(log(n)) delays), and developing new VLSI circuits that directly implement inverse functions.
2. Self-Driving Vehicle Testbed: The goal of this project is to build an experimental testbed to emulate realistic scenarios for self-driving vehicles and test the effectiveness of different driving algorithms. The testbed will target a traffic intersection and will include a set of scaled-down autonomous cars, a programmable traffic light sequencer to emulate the traffic and pedestrian signals, and a set of robots to emulate pedestrian traffic. The students will closely collaborate with USC Viterbi faculty and Ph.D. students to define the architecture of the testbed, define and assemble the different components, implement the driving scenarios on the testbed, and collect data. Activities will include programming the driving algorithms in a simulation environment as well as on embedded microcontrollers (e.g., on Raspberry Pi boards).
3. High-Assurance Design of Safety-Critical Autonomous Systems with Machine Learning Components: Autonomous systems are particularly desirable for a variety of applications, such as driverless cars, spaceflight, household maintenance, and delivery of goods and services. To achieve high degrees of autonomy and operation in uncertain environments, these systems will increasingly adopt machine learning algorithms. These algorithms have achieved human-level performance or better on a number of tasks; however, their deployment in safety-critical applications brings additional sources of approximation that require formal analysis and design methods to ensure that the implemented system is safe and avoids undesired outcomes. The goal of this project is to develop, simulate, and test scalable analysis and design procedures that can guarantee correct operation of safety-critical autonomous systems.
4. Security-Driven Optimized Obfuscation of Integrated Circuits: Integrated circuit obfuscation consists of a set of techniques that are used to prevent the reverse-engineering of integrated circuits and the insertion of hardware Trojans. Several obfuscation techniques have been developed over the years, but there is little agreement on metrics to validate the security claims, or tools to select and assess the effectiveness of obfuscation. The goal of this project is to develop an obfuscation design methodology and unified metrics which treat obfuscation security as a first-class design constraint and enable the selection of an appropriate mix of techniques to satisfy a set of security and performance objectives.
Prof. Manuel Monge
Areas of Research: Integrated Circuits (ICs) for Medical Electronics, Miniature Medical Devices, Neural Interfaces.
Group Description: We combine and integrate physical and biological principles into the design of integrated circuits to engineer miniature biomedical devices for fundamental research, medical diagnosis and treatment.
1. Bidirectional Neural Interfaces - Neural interfaces directly interact with neurons in our body. Neural recordings of brain activity are one of the key elements for studying the brain, and have led to remarkable breakthroughs in science, engineering and medicine. Similarly, neural stimulation of key brain regions has enabled treatment of medical conditions such as Parkinson's disease, epilepsy, and others. Currently approved neural interface devices have limited bandwidth and spatial coverage of the brain, with up to tens or hundreds of channels in the system. The majority of these channels are dedicated to recording and only a few to stimulation. Future high-density, high-bandwidth bidirectional neural interface systems will support thousands of channels for simultaneous recording and stimulation, and could provide real-time visualization of multiple brain regions with high temporal and spatial resolution. The summer intern can work on various aspects of neural amplifier and neural stimulator ICs including design, simulation, and measurements of components and systems.
2. Location-Broadcasting Bio-Chips - The function of miniature wireless medical devices such as capsule endoscopes, biosensors, and drug-delivery systems critically depends on their location inside the body. However, existing electromagnetic, acoustic and imaging-based methods for localizing and communicating with such devices are limited by the physical properties of tissue or the performance of the imaging modality. We recently developed a new class of microchips for localization of microscale devices by embodying the principles of nuclear magnetic resonance in a silicon integrated circuit. We mimicked the behavior of nuclear spins and engineered miniaturized RF transmitters that encode their location in space by shifting their output frequency in proportion to the local magnetic field. The application of external field gradients then allows each device to be located precisely from its signal's frequency. This technology is inherently robust to tissue properties, scalable to multiple devices, and suitable for the development of microscale devices to monitor and treat disease. The summer intern can work on various aspects of this new technology including design, simulation, and measurements of low-power analog and mixed-signal ICs, and external multi-channel RF receivers for interfacing with these new devices.
3. Wireless Implantable Biosensors - Implantable biosensors are emerging as new devices capable of continuous in vivo monitoring of clinically relevant biomarkers. As these devices become smaller, reducing the power consumption while maintaining or improving performance is paramount. We focus on developing high-sensitivity, high-dynamic range, and low-power micro-scale electrochemical sensors for measurement of different biomarkers such as glucose, proteins, and ions. The summer intern can work on various aspects of wireless implantable biosensor ICs including design, simulation, and measurements of components and systems.
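The frequency-encoding idea in project 2 (Location-Broadcasting Bio-Chips) can be illustrated with a one-dimensional worked relation. This is a simplified sketch, not the group's actual device model: the proportionality constant \gamma, the background field B_0, the gradient strength G, and the position x are symbols assumed here for illustration only.

```latex
f(x) = \gamma\, B(x) = \gamma\,(B_0 + G\,x)
\quad\Longrightarrow\quad
x = \frac{f/\gamma - B_0}{G}
```

Under these assumptions, two devices separated by a distance \Delta x differ in output frequency by \Delta f = \gamma G \Delta x, so a steeper gradient spreads neighboring devices further apart in frequency.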
Prof. David R. Traum
Projects: Natural Language Dialogue Systems - we are interested in and working on many areas involved in improving performance and extending the capabilities of such systems. Unlike many labs that focus strictly on assistant systems or chat, we have extensive research efforts on the following kinds of applications (as well as others): Virtual Human Dialogue; Dialogue with Robots; Roleplay dialogue; Conversations with History and Heroes (video of an actual person); Dialogue in Games; Extended dialogue interaction; Storytelling dialogue.
Prof. Alice C. Parker
Areas of Research: Neuromorphic Circuits to Model Neurons in the Human Brain.
Group description: My group researches nanotechnologies and analog electronic circuits that capture brain-like behavior.
1. Neuromorphic circuits that model neurons in the cortex that learn new skills without forgetting.
2. Circuits incorporating nanotechnology models (carbon nanotubes, memristors, molybdenum disulfide) to model neurons.
3. Neuromorphic circuits that model neurons in the cortex of a robotic cat.
Previous IIT interns have been lead authors on conference papers resulting from their internships. Background in high-school biology is sufficient for the project.
Published on September 18th, 2017
Last updated on May 28th, 2022