Selected publications, tools, and academic projects.
Accepted technical presentation in the “Kokkos in Applications” session (HPC performance optimization)
Conference presentation (slides, no audio)
Conference presentation (slides with animations and schlieren videos, no audio)
Contributed to PlaidML, an MLIR-based tensor compiler targeting heterogeneous hardware (NVIDIA, AMD, Intel GPUs/CPUs, Metal). Developed and extended the Tile eDSL (C++/Python) to express tensor computations and integrate with ONNX Runtime, OpenVINO, and TensorFlow.
Key contributions:
Example workload: SciTile implementation of grid-based surface and volume integrals, based on the formulation in “Treat All Integrals as Volume Integrals”.
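As a rough illustration of the grid-based idea, the sketch below evaluates a volume and a surface area for an implicitly defined sphere by summing over every cell of a uniform grid. It is plain Python, not SciTile or Tile eDSL code. It uses a smoothed Heaviside function and a smoothed delta function from the level-set literature as a stand-in, which may differ from the cited formulation. All function names and parameters (`grid_integrals`, `eps_cells`, and so on) are hypothetical.

```python
import math


def smoothed_heaviside(f, eps):
    """Cosine-smoothed step: 0 for f < -eps, 1 for f > eps."""
    if f < -eps:
        return 0.0
    if f > eps:
        return 1.0
    return 0.5 * (1.0 + f / eps + math.sin(math.pi * f / eps) / math.pi)


def smoothed_delta(f, eps):
    """Derivative of smoothed_heaviside; nonzero only for |f| < eps."""
    if abs(f) >= eps:
        return 0.0
    return (1.0 + math.cos(math.pi * f / eps)) / (2.0 * eps)


def grad_norm(f, x, y, z, d):
    """Central-difference estimate of |grad f| at (x, y, z)."""
    gx = (f(x + d, y, z) - f(x - d, y, z)) / (2.0 * d)
    gy = (f(x, y + d, z) - f(x, y - d, z)) / (2.0 * d)
    gz = (f(x, y, z + d) - f(x, y, z - d)) / (2.0 * d)
    return math.sqrt(gx * gx + gy * gy + gz * gz)


def grid_integrals(f, lo, hi, n, eps_cells=2.0):
    """Approximate the volume of {f < 0} and the area of {f = 0}.

    Both quantities are sums over the same n^3 cell-centered grid on
    [lo, hi]^3. Assumption: f behaves like a signed distance near the
    surface, so a smoothing width of eps_cells grid cells is meaningful.
    """
    h = (hi - lo) / n
    eps = eps_cells * h
    cell = h ** 3
    volume = 0.0
    area = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        for j in range(n):
            y = lo + (j + 0.5) * h
            for k in range(n):
                z = lo + (k + 0.5) * h
                v = f(x, y, z)
                # Volume: integrate the smoothed indicator of the interior.
                volume += (1.0 - smoothed_heaviside(v, eps)) * cell
                # Surface: integrate delta(f) * |grad f| over the volume;
                # only cells in the narrow band |f| < eps contribute.
                if abs(v) < eps:
                    g = grad_norm(f, x, y, z, 0.5 * h)
                    area += smoothed_delta(v, eps) * g * cell
    return volume, area


def sphere(x, y, z, radius=1.0):
    """Signed distance to a sphere centered at the origin."""
    return math.sqrt(x * x + y * y + z * z) - radius


volume, area = grid_integrals(sphere, -1.5, 1.5, 60)
print(f"volume ~ {volume:.4f} (exact {4.0 / 3.0 * math.pi:.4f})")
print(f"area   ~ {area:.4f} (exact {4.0 * math.pi:.4f})")
```

Because both integrals reduce to independent per-cell contributions followed by a sum, the loop body maps naturally onto a tensor expression followed by a reduction, which is the kind of computation a tensor compiler can parallelize.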
Images © Intel Corporation. Source: Intel COMPUTEX 2019 Press Materials
Developed a face-recognition module for Intel's ambient computing concept platform, demonstrated at COMPUTEX 2019. The system used a 360° camera to identify users and dynamically adjust the device interface: when the device was closed, it presented recognized users with contextual information on an auxiliary touchbar display; when opened, it authenticated users and logged them into the primary OS.
Implementation involved cross-OS communication, real-time inference integration, and synchronization between device drivers, camera pipelines, and UI subsystems. The dual-OS architecture with shared peripherals required performance-sensitive, reliability-critical software design.