Reinforcement Learning for Task and Motion Planning in Mobile Home Robots

I used a Double Deep Q-Network (Double DQN) to complete complex sequential tasks in home environments. The model was trained in distinct environments on a variety of tasks involving both motion and object interaction. A custom one-hot encoding represents the robot's state relative to the environment, which accommodates real-time learning. Testing in the simulated environments demonstrated the model's ability to learn an optimal policy that completes tasks in the minimum number of actions with no failures.
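
For readers unfamiliar with Double DQN, the sketch below shows the standard target computation: the online network picks the next action and the target network scores it. It is a minimal illustration in plain Python, not the project's code; the function name, argument layout, and `gamma` default are assumptions made for this example.

```python
def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute Double DQN bootstrap targets for a batch of transitions.

    q_online_next[i] and q_target_next[i] are lists of Q-values for the
    next state of transition i, one entry per action (hypothetical layout).
    """
    targets = []
    for reward, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        if done:
            # Terminal transition: no bootstrapping.
            targets.append(reward)
            continue
        # The online network selects the greedy next action...
        best_action = max(range(len(q_on)), key=q_on.__getitem__)
        # ...and the target network evaluates that action.
        targets.append(reward + gamma * q_tg[best_action])
    return targets
```

Separating action selection from action evaluation is what distinguishes this from the plain DQN target, which takes the maximum over the target network's own Q-values.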

Source code here

Find the full report here

Lifelong Learning for Task and Motion Planning in Mobile Home Robots

A dual-policy task and motion planning approach to catastrophic forgetting, in which a robot forgets how to complete a task after being placed in a new environment. One policy guides motion using an RRT. The other policy is trained with a limited-memory replay buffer to simulate a robot's memory constraints, and uses Gradient Episodic Memory to ensure that the loss on previous tasks does not increase while learning new tasks. This model suffered from poor exploration, which was addressed in the reinforcement learning version of the project.
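
The core idea of Gradient Episodic Memory can be sketched as a gradient projection. The example below covers only the simplified case of a single memory gradient; full GEM solves a small quadratic program with one constraint per previous task. The function name and flat-list gradient representation are assumptions for illustration, not the project's implementation.

```python
def project_gradient(grad, mem_grad):
    """Project grad so it does not increase the loss on replayed memory.

    grad:     gradient of the loss on the new task (flat list of floats)
    mem_grad: gradient of the loss on samples from the replay buffer
    """
    dot = sum(g * m for g, m in zip(grad, mem_grad))
    if dot >= 0:
        # The update does not conflict with the remembered task.
        return list(grad)
    norm_sq = sum(m * m for m in mem_grad)
    # Remove the component of grad that points against mem_grad.
    scale = dot / norm_sq
    return [g - scale * m for g, m in zip(grad, mem_grad)]
```

After projection, the dot product between the update and the memory gradient is zero rather than negative, so a small step along it should not increase the remembered task's loss to first order.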

Source code here

Find the full report here

Deep Neural Network Verification

Verification of various deep neural networks, including networks trained on MNIST, CIFAR, and GTSRB, using α-β-CROWN and NeuralSAT. Developed a sweeping ε method to analyze robustness properties for optimal verification and adversarial example detection. Analyzed how verification results vary with network architecture and image complexity.
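
One way to picture an ε sweep is shown below. This is a hedged sketch only: it does not call α-β-CROWN or NeuralSAT, and the report's actual method may differ. As a stand-in verifier it uses a linear binary classifier, for which L∞ robustness can be checked exactly; `linear_is_robust` and `sweep_epsilon` are hypothetical names.

```python
def linear_is_robust(weights, bias, x, eps):
    """Exact L-infinity robustness check for a linear binary classifier.

    The sign of w.x + b cannot flip within the eps-ball around x
    as long as |w.x + b| > eps * ||w||_1.
    """
    margin = sum(w * xi for w, xi in zip(weights, x)) + bias
    return abs(margin) > eps * sum(abs(w) for w in weights)


def sweep_epsilon(verify, eps_values):
    """Return the largest eps verified before the first failure, or None."""
    largest = None
    for eps in sorted(eps_values):
        if not verify(eps):
            break
        largest = eps
    return largest


# Usage: sweep increasing perturbation radii for one input.
w, b, x = [1.0, -2.0], 0.5, [1.0, 0.0]
radius = sweep_epsilon(lambda e: linear_is_robust(w, b, x, e),
                       [0.125, 0.25, 0.375, 0.5, 0.625])
```

With a real verifier, the `verify` callback would wrap a tool invocation, and a timeout or "unknown" result would need its own handling.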

Source code here

Find the full report here

Qiskit-Aer Quantum Computer Simulator Compiler Optimization

Developed gate cancellation and qubit-operation clustering for integration into a high-performance, GPU-accelerated quantum computer simulator.

Gate Cancellation: Statically determines each qubit's basis state to eliminate CNOTs, for an average 25% decrease in gate count.
Gate Clustering: Reorganizes the circuit to cluster operations on the same qubits, reducing costly CPU-GPU data exchanges. A static postpone dataflow analysis is used to achieve 30% speedups in execution time.
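
The gate cancellation idea can be illustrated with a small sketch. It assumes all qubits start in |0> and uses a hypothetical circuit format of tuples such as `("x", q)`, `("h", q)`, and `("cx", control, target)`; it does not use the Qiskit-Aer API and is not the project's implementation.

```python
def eliminate_cnots(circuit, num_qubits):
    """Remove or simplify CNOTs whose control is in a known basis state.

    Each qubit's state is tracked as 0, 1, or None (not a known basis state).
    """
    state = [0] * num_qubits  # assumption: every qubit starts in |0>
    optimized = []
    for gate in circuit:
        name = gate[0]
        if name == "x":
            q = gate[1]
            if state[q] is not None:
                state[q] ^= 1
            optimized.append(gate)
        elif name == "cx":
            control, target = gate[1], gate[2]
            if state[control] == 0:
                continue  # control is |0>: the CNOT is the identity
            if state[control] == 1:
                # control is |1>: the CNOT acts as a plain X on the target
                optimized.append(("x", target))
                if state[target] is not None:
                    state[target] ^= 1
            else:
                state[target] = None  # target may now be entangled
                optimized.append(gate)
        else:
            # Any other gate: conservatively mark its qubits as unknown.
            for q in gate[1:]:
                state[q] = None
            optimized.append(gate)
    return optimized
```

The analysis is conservative: once a qubit leaves a known basis state, it is never assumed to return to one.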

Source code here

Find the full report here

Machine-independent optimizing C compiler

For a subset of the C programming language, I built a compiler that scans, parses, builds intermediate representations, and optimizes the code. The scanner, parser, and abstract syntax tree (AST) are built in C++. Three-address code generation and all optimizations are implemented in Python 3.

Features

Scanner that finds syntax errors
Parser that builds an abstract syntax tree (AST). Error recovery is implemented using Wirth's algorithm so that, after an error, the parser recovers and continues to find additional errors, which are displayed to the user.
Three-address code generated from the AST: machine-independent, low-level code that is not bound to control flow
Basic blocks and a control flow graph built from the three-address code to support data flow analysis and optimizations
Optimizations: local value numbering, dead code elimination, partial redundancy elimination
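
As an illustration of the first optimization listed, here is a minimal sketch of local value numbering over one basic block. The instruction format `(dest, op, arg1, arg2)` and the `"copy"` pseudo-op are assumptions for this example and may not match the compiler's actual three-address representation.

```python
from itertools import count

COMMUTATIVE = {"+", "*"}


def local_value_numbering(block):
    """Replace recomputed expressions in a basic block with copies."""
    fresh = count()
    var_vn = {}   # variable or constant -> value number
    expr_vn = {}  # (op, vn1, vn2) -> value number of the result
    holder = {}   # value number -> variable that first held it
    out = []

    def number(operand):
        if operand not in var_vn:
            var_vn[operand] = next(fresh)
        return var_vn[operand]

    for dest, op, a, b in block:
        args = [number(a), number(b)]
        if op in COMMUTATIVE:
            args.sort()  # a + b and b + a get the same key
        key = (op, *args)
        vn = expr_vn.get(key)
        src = holder.get(vn) if vn is not None else None
        if src is not None and var_vn.get(src) == vn:
            # The value is still available in src: reuse it.
            out.append((dest, "copy", src, None))
        else:
            if vn is None:
                vn = next(fresh)
                expr_vn[key] = vn
            holder[vn] = dest
            out.append((dest, op, a, b))
        var_vn[dest] = vn
    return out
```

The check `var_vn.get(src) == vn` guards against reusing a variable that has been overwritten since it last held the value.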

Autonomous 7-degree-of-freedom robot

The robot autonomously grasps and places objects in a world environment. The world environment is built from sensor data, and the state space is discretized in steps of pi/16.0 radians. Rapidly exploring random trees (RRTs) sample the state space to find a path to the goal. Each iteration samples a random configuration within the robot's link and joint limits and finds the nearest node already in the tree. A new node is added pi/16.0 radians from the nearest node in the direction of the sample, with an edge connecting the nearest and new nodes. Once the goal is reached, each node's back-reference pointer is followed to recover the path.
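
The RRT loop described above can be sketched roughly as follows. This is a simplified joint-space version under several assumptions: the collision check `is_free` is a hypothetical callback that accepts everything by default, goal biasing is added to speed up convergence, and the step is shortened when the sample is closer than pi/16.0 radians. It is not the robot's actual planner.

```python
import math
import random

STEP = math.pi / 16.0  # extension distance in joint space (radians)


def rrt(start, goal, limits, is_free=lambda q: True,
        max_iters=5000, goal_bias=0.1, seed=0):
    """Grow a tree from start until a node lands within STEP of goal."""
    rng = random.Random(seed)
    goal = tuple(goal)
    nodes = [tuple(start)]
    parent = {0: None}  # back-reference pointers: node index -> parent index
    for _ in range(max_iters):
        if rng.random() < goal_bias:
            sample = goal
        else:
            sample = tuple(rng.uniform(lo, hi) for lo, hi in limits)
        # Nearest node already in the tree.
        near_idx = min(range(len(nodes)),
                       key=lambda i: math.dist(nodes[i], sample))
        near = nodes[near_idx]
        d = math.dist(near, sample)
        if d == 0:
            continue
        # Step toward the sample, at most STEP radians.
        scale = min(1.0, STEP / d)
        new = tuple(n + scale * (s - n) for n, s in zip(near, sample))
        if not is_free(new):
            continue
        nodes.append(new)
        new_idx = len(nodes) - 1
        parent[new_idx] = near_idx
        if math.dist(new, goal) <= STEP:
            # Follow the back references to the root, then reverse.
            path, idx = [], new_idx
            while idx is not None:
                path.append(nodes[idx])
                idx = parent[idx]
            return path[::-1]
    return None


path = rrt(start=(0.0, 0.0), goal=(1.0, 1.0),
           limits=[(-math.pi, math.pi)] * 2)
```

A plain RRT returns a feasible path rather than a guaranteed shortest one; variants such as RRT* add rewiring to improve path length over time.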

House automation

A variety of projects for my college house that improve quality of life. Almost all of these projects run on Raspberry Pis.

Sound request station
Multi-user jukebox site that authorized users can access from their phones. Authorization is based on connection to the home Wi-Fi network. A touchscreen tablet also serves as a physical jukebox. Songs that pass my filter are added to the queue through the Spotify API.
Card swipe access to house locks
- Swiping a URID card through an electromagnetic card reader locks or unlocks the house door.
- A servo controls the rods that rotate the deadbolt.
- Mechanical locking is still supported, as the program can recognize when the lock's state differs from its last recorded state.

Scott Sikorski

sgsikorski20@gmail.com

nqj5ak@virginia.edu
