For my posts before 2022, please visit https://forrestbao.blogspot.com
(This is the first post of my series of thoughts on the limitations of Jupyter notebooks, the tool that every AI/ML/Data person uses.)
First, forget about solutions using iwconfig (which does not support WPA) or wpa_supplicant. They are outdated.
I have begun my journey to hack into Jupyter. The first step is to understand how to send code to a Jupyter kernel and get the results back.
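The round trip described above can be sketched with the jupyter_client library (the same one Jupyter's frontends use): start a kernel, send an `execute_request`, and collect the output that streams back on the IOPub channel until the kernel reports it is idle again. This is a minimal sketch, assuming a local `python3` kernel (i.e., ipykernel) is installed; error and rich-display messages are ignored for brevity.

```python
def run_on_kernel(code, timeout=30):
    """Start a fresh IPython kernel, execute `code`, and return its stdout."""
    from jupyter_client.manager import start_new_kernel

    # start_new_kernel returns a manager (owns the kernel process)
    # and a blocking client (owns the ZeroMQ channels to it).
    km, kc = start_new_kernel(kernel_name="python3")
    try:
        msg_id = kc.execute(code)  # send an execute_request; returns its msg_id
        captured = []
        while True:
            msg = kc.get_iopub_msg(timeout=timeout)
            # Skip messages that are replies to someone else's request.
            if msg["parent_header"].get("msg_id") != msg_id:
                continue
            if msg["msg_type"] == "stream":
                captured.append(msg["content"]["text"])  # stdout/stderr text
            elif (msg["msg_type"] == "status"
                  and msg["content"]["execution_state"] == "idle"):
                break  # kernel finished processing our request
        return "".join(captured)
    finally:
        kc.stop_channels()
        km.shutdown_kernel()
```

For example, `run_on_kernel("print(21 * 2)")` should hand back the string `"42\n"` that the kernel printed.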
- The input of a question generator can contain the answer (e.g., a span in the input document) or not. The former is answer-aware while the latter is answer-agnostic, or without-answer-supervision. The answer-agnostic case is similar to summarization.
- Shallow vs. Deep QG: A question is shallow, also called low cognitive demand, if its answer can be found in one sentence and/or is explicitly given. Recently, the focus has shifted to multi-hop QG, where the answer can only be obtained by reasoning over multiple sentences. Such questions are considered high cognitive demand (HCD) or deep.
- The output of a question generator can be sentences or multiple-choice questions. In the multiple-choice case, a key challenge is generating good distractors.
A friend recommended to me a library from Lightning AI called Lit-GPT for finetuning GPT-style/generative/causal language models. I gave it a try. This blog post is about my experience.
Blogs & review papers
- https://lightning.ai/pages/community/article/understanding-llama-adapters/ (I highly recommend this blog post, which uses good graphics and code to explain the PEFT methods)
- Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning
I recently needed to access the commands and titles of VSCode plug-ins/extensions.
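Each installed extension declares its commands in its `package.json` manifest, under the `contributes.commands` section, as objects with `command` (the ID) and `title` fields. A minimal sketch for pulling those out, assuming extensions live one folder per extension under a directory such as `~/.vscode/extensions` (the default install location on Linux/macOS):

```python
import json
import pathlib


def list_commands(ext_dir):
    """Yield (command_id, title) pairs declared by extensions under ext_dir.

    Each extension folder carries a package.json manifest; commands are
    listed in its `contributes.commands` section.
    """
    for manifest in sorted(pathlib.Path(ext_dir).glob("*/package.json")):
        meta = json.loads(manifest.read_text(encoding="utf-8"))
        for cmd in meta.get("contributes", {}).get("commands", []):
            yield cmd.get("command"), cmd.get("title")
```

Usage would be something like `list(list_commands(pathlib.Path.home() / ".vscode" / "extensions"))`. Note that titles can also be localization keys like `%ext.hello.title%`, resolved against a `package.nls.json` file, which this sketch does not handle.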
Jupyter is cool. It takes advantage of the interpreted nature of scripting languages: the variables, functions, and types/classes that you define remain in memory for you to access handily. Jupyter wraps a notebook layer around that experience.
Z690 motherboards for multi-GPU workstations
Demystifying PCIe lanes in GPU computing for Deep Learning: Forgetting about AMD Threadrippers and Intel X-series
- I am building a multi-GPU workstation of my own.
- I previously owned an Intel i9 10980XE CPU.
- And I am here to tell you that the 48 PCIe lanes of Intel X-series CPUs and the 128 lanes of AMD Threadripper CPUs are overkill for multi-GPU computers, even though each GPU is most effective with 16 PCIe lanes.
- Just get a regular desktop CPU from Intel or AMD (Intel preferred), paired with a proper chipset.
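The bullets above can be backed with back-of-the-envelope arithmetic. Using the commonly cited per-lane figures of roughly 1 GB/s for PCIe 3.0 and 2 GB/s for PCIe 4.0 (one direction, after encoding overhead), even two GPUs splitting a consumer CPU's 16 lanes into x8/x8 each keep substantial bandwidth. This is a rough sketch with those assumed figures, not a measurement:

```python
# Approximate one-directional throughput per lane, in GB/s,
# after encoding overhead (assumed round figures).
GBPS_PER_LANE = {"3.0": 1.0, "4.0": 2.0}


def link_bandwidth(gen, lanes):
    """Approximate one-directional bandwidth of a PCIe link in GB/s."""
    return GBPS_PER_LANE[gen] * lanes


# A full x16 PCIe 4.0 link vs. two GPUs sharing the lanes at x8/x8:
full = link_bandwidth("4.0", 16)   # ~32 GB/s
half = link_bandwidth("4.0", 8)    # ~16 GB/s per GPU
```

At x8 on PCIe 4.0 a GPU still sees on the order of 16 GB/s to the host, which deep learning training (where most traffic is occasional batch transfers, not sustained streaming) rarely saturates; hence the 48- and 128-lane HEDT platforms buy little for this workload.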
I have a few RTX 3090 GPU cards and a few students working with me. I used to give one 3090 card to each student to run his own experiments on his own computer. Recently, we decided to pool those cards together into one workstation. The solutions from vendors failed to make me happy, so I decided to buy a CPU and a motherboard that can host multiple 3090 cards.