PyLO: Towards Accessible Learned Optimizers in PyTorch

A PyTorch library that makes learned optimizers accessible to the broader ML community. Its CUDA-accelerated implementations deliver substantial speedups (5x for ViT training) and integrate seamlessly with existing PyTorch workflows, enabling practical application of learned optimization to real-world, large-scale tasks.

Paul Janson, Benjamin Therien, Quentin Anthony, Xiaolong Huang, Abhinav Moudgil, Eugene Belilovsky

Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training

We demonstrate that infinite learning rate schedules consistently outperform widely-used repeated cosine decay for continual pre-training under distribution shifts across both vision and language models, providing a more effective alternative for large-scale self-supervised learning without catastrophic forgetting.

Paul Janson, Vaibhav Singh, Paria Mehrbod, Adam Ibrahim, Irina Rish, Eugene Belilovsky, Benjamin Therien

Towards motion from video diffusion models

This study investigates the capabilities of video diffusion models in generating human motion from text prompts, revealing their strengths in common motions and limitations in rare or complex movements.

Paul Janson, Tiberiu Popa, Eugene Belilovsky

Continual zero-shot learning through semantically guided generative random walk

Learning novel concepts, remembering previous knowledge, and adapting it to future tasks occur simultaneously throughout a human’s …

Wenxuan Zhang, Paul Janson, Divyansh Jha, Kai Yi, Ivan Skorokhodov, Mohammed Elhoseiny

Overcoming Generic Knowledge Loss with Selective Parameter Update

Adding knowledge to a model without destroying its generalization by fine-tuning a small set of parameters

Wenxuan Zhang, Paul Janson, Rahaf Aljundi, Mohammed Elhoseiny

Domain Aware Zero shot learning

Continual zero-shot learning involves learning seen classes incrementally while improving the ability to recognize unseen or …

Kai Yi, Paul Janson, Wenxuan Zhang, Mohammed Elhoseiny

A Simple baseline that questions the use of pre-trained models in continual learning

A baseline that performs better without training on continual learning benchmarks

Paul Janson, Wenxuan Zhang, Rahaf Aljundi, Mohammed Elhoseiny