These two articles helped clear up a lot of my confusion:
Ubuntu Docker + python3
I recently had to do a quick test of using Python on Ubuntu, so I decided to use Docker.
Steps:
sudo docker run -it ubuntu bash
apt-get update
apt-get install python3-pip
# python3 --version
Python 3.8.5
To load up other images, I use variants like these:
sudo docker run -it -v $HOME:/work pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel bash
sudo docker run -it --ipc=host --rm -v $HOME:/work --privileged pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel bash
sudo docker run -it -v $HOME:/work py37_pytorch16_dte bash
sudo docker run -it -v $HOME:/work py37_trch16_trfmr43 bash
Beyond Integer indexing
I faced an interesting problem recently:
a : (B, S, T)
b : (B, C) where 0 <= b[i, j] < S
What I want is an array of shape (B, C, T), where out[i, j] = a[i, b[i, j]]
a = np.array(
    [[[0, 1, 2, 3],
      [4, 5, 6, 7],
      [8, 9, 10, 11]],
     [[0, 1, 2, 3],
      [4, 5, 6, 7],
      [8, 9, 10, 11]]])
b = np.array(
    [[0, 2, 2],
     [1, 0, 2]])
a.shape  # (2, 3, 4)
b.shape  # (2, 3)
What I expect is this
array([[[ 0, 1, 2, 3],
[ 8, 9, 10, 11],
[ 8, 9, 10, 11]],
[[ 4, 5, 6, 7],
[ 0, 1, 2, 3],
[ 8, 9, 10, 11]]])
Note this is different from the typical scenario
Initially I hit some issues with integer index broadcasting, but it turns out this is possible:
a[np.array([np.arange(2)]).T, b]
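Spelling the one-liner out: the row index of shape (2, 1) broadcasts against b's shape (2, 3), so each row of b gathers rows from the matching batch of a. A runnable sketch with the arrays from above:

```python
import numpy as np

a = np.stack([np.arange(12).reshape(3, 4)] * 2)  # (B=2, S=3, T=4)
b = np.array([[0, 2, 2],
              [1, 0, 2]])                        # (B=2, C=3)

rows = np.arange(2)[:, None]  # shape (2, 1); equivalent to np.array([np.arange(2)]).T
out = a[rows, b]              # (2, 3, 4); out[i, j] == a[i, b[i, j]]
print(out)
```

The broadcast pairs each batch index i with every column of b's row i, which is exactly the (B, C, T) gather described above.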
References:
PyTest live logging in PyCharm
PyTest does allow output to be 'live printed' as tests run.
Also, it is possible to see logging output in PyTest.
Checkout these two links:
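A minimal sketch of what this looks like in practice: a test module whose log lines show up live in the console when run with pytest's `log_cli` option enabled (e.g. `pytest -o log_cli=true -o log_cli_level=INFO`; the same options can also go in pytest.ini). The test name here is my own illustration:

```python
import logging

logger = logging.getLogger(__name__)

def test_logging_is_visible():
    # With log_cli enabled, this INFO line streams to the console
    # as the test runs, instead of being captured silently.
    logger.info("this appears live in the PyCharm/console output")
    assert True
```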
Updating the transformers package
A few steps I follow each time I update the transformers package:
git pull
pip install --upgrade .
pip install -r ./examples/requirements.txt
That's it.
Latex multirow and column
Short and intuitive example
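As a reminder to myself, a minimal sketch of `\multirow` and `\multicolumn` together (assumes `\usepackage{multirow}`; the table content is my own illustration, not from the linked example):

```latex
% \multirow{2}{*}{A}: cell A spans 2 rows (auto width);
% \multicolumn{2}{c|}{B}: cell B spans 2 columns, centered.
\begin{tabular}{|c|c|c|}
\hline
\multirow{2}{*}{A} & \multicolumn{2}{c|}{B} \\ \cline{2-3}
                   & b1 & b2 \\
\hline
\end{tabular}
```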
Numpy RuntimeWarning
Something I learned recently:
- NumPy has its own internal warning architecture on top of Python's, which can be controlled separately
- So, sometimes NumPy will just produce a RuntimeWarning without actually throwing an exception
Consider this:
probs = np.array([0.0, 1.0])
np.prod(probs)**(-1/len(probs))
NumPy produces a RuntimeWarning here, not an exception
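A sketch of controlling that machinery with np.errstate, which acts on NumPy's floating-point error state independently of Python's `warnings` module:

```python
import numpy as np

probs = np.array([0.0, 1.0])

# np.prod(probs) is 0.0, so the negative power triggers a divide-by-zero.
# With divide="ignore" the block runs silently and just returns inf:
with np.errstate(divide="ignore"):
    val = np.prod(probs) ** (-1 / len(probs))
print(val)  # inf

# errstate can also escalate the warning into a real exception:
try:
    with np.errstate(divide="raise"):
        np.prod(probs) ** (-1 / len(probs))
except FloatingPointError as e:
    print("raised:", e)
```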
References:
Gradient accumulation in PyTorch
Need to understand:
- https://medium.com/huggingface/training-larger-batches-practical-tips-on-1-gpu-multi-gpu-distributed-setups-ec88c3e51255
- https://discuss.pytorch.org/t/why-do-we-need-to-set-the-gradients-manually-to-zero-in-pytorch/4903/20
- https://gist.github.com/thomwolf/ac7a7da6b1888c2eeac8ac8b9b05d3d3
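My current understanding of the pattern from those posts, as a minimal sketch (the model, data, and accumulation count here are placeholders of my own, not from the links):

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
accumulation_steps = 4

# 8 fake mini-batches of (input, target)
data = [(torch.randn(2, 4), torch.randn(2, 1)) for _ in range(8)]

optimizer.zero_grad()
for step, (x, y) in enumerate(data):
    loss = torch.nn.functional.mse_loss(model(x), y)
    # Scale so the accumulated gradient matches one big-batch update.
    (loss / accumulation_steps).backward()
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()       # update only every `accumulation_steps` batches
        optimizer.zero_grad()  # reset the accumulated gradients
```

The key point from the discussion thread: `.backward()` accumulates into `.grad` by default, which is exactly why we must zero gradients manually, and also what makes this trick possible.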
tiling and repeating tensors
Repeat entire tensor:
Repeat elements of the tensor
- PyTorch:
- Numpy:
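The distinction above in a runnable sketch, side by side in PyTorch and NumPy:

```python
import numpy as np
import torch

t = torch.tensor([1, 2, 3])
a = np.array([1, 2, 3])

# Repeat the entire tensor/array:
t.repeat(2)              # tensor([1, 2, 3, 1, 2, 3])
np.tile(a, 2)            # array([1, 2, 3, 1, 2, 3])

# Repeat each element:
t.repeat_interleave(2)   # tensor([1, 1, 2, 2, 3, 3])
np.repeat(a, 2)          # array([1, 1, 2, 2, 3, 3])
```

Note the naming trap: PyTorch's `repeat` matches NumPy's `tile`, while NumPy's `repeat` matches PyTorch's `repeat_interleave`.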
pack_padded_sequence
Nice article on the use of pack_padded_sequence in PyTorch:
https://suzyahyah.github.io/pytorch/2019/07/01/DataLoader-Pad-Pack-Sequence.html
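The pad-then-pack round trip the article walks through, as a minimal sketch (the toy sequences here are my own):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Two sequences of lengths 3 and 2, padded to length 3:
padded = torch.tensor([[1, 2, 3],
                       [4, 5, 0]])
lengths = torch.tensor([3, 2])

packed = pack_padded_sequence(padded, lengths, batch_first=True)
# packed.data keeps only real timesteps, interleaved time-major:
# tensor([1, 4, 2, 5, 3])

# And back to a padded tensor:
unpacked, out_lengths = pad_packed_sequence(packed, batch_first=True)
```

A PackedSequence like this can be fed straight to an RNN/LSTM so the padding steps are skipped entirely.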