Some people build models that stack BERT, an LSTM and a CRF, such as BERT-BiLSTM-CRF-NER. In theory, though, BERT's self-attention layers already take over the contextual-encoding role of the LSTM, so I think the LSTM is redundant.
As for performance, in my experience BERT+CRF is consistently a little better than BERT alone.
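
One plausible reason for that small gain is that a CRF scores whole tag sequences, so it can rule out invalid label transitions that independent per-token predictions allow. Below is a minimal sketch in plain Python, not the code from any particular repository: the tag set, the emission scores (standing in for BERT's per-token outputs) and the transition scores are all made-up values for illustration.

```python
# Hypothetical tag set and scores, chosen only to illustrate the idea.
TAGS = ["O", "B-PER", "I-PER"]

def argmax_decode(emissions):
    """Pick the best tag for each token independently (BERT + softmax style)."""
    return [max(range(len(row)), key=row.__getitem__) for row in emissions]

def viterbi_decode(emissions, transitions, start):
    """Return the highest-scoring tag path under emission + transition scores."""
    n_tags = len(start)
    score = [start[t] + emissions[0][t] for t in range(n_tags)]
    backptrs = []
    for row in emissions[1:]:
        new_score, ptr = [], []
        for cur in range(n_tags):
            # Best previous tag for reaching `cur` at this position.
            prev = max(range(n_tags), key=lambda p: score[p] + transitions[p][cur])
            new_score.append(score[prev] + transitions[prev][cur] + row[cur])
            ptr.append(prev)
        score = new_score
        backptrs.append(ptr)
    # Walk the back-pointers from the best final tag to recover the path.
    best = max(range(n_tags), key=score.__getitem__)
    path = [best]
    for ptr in reversed(backptrs):
        best = ptr[best]
        path.append(best)
    path.reverse()
    return path

# Made-up per-token scores for a 3-token sentence (columns follow TAGS).
emissions = [
    [2.0, 1.5, 0.0],   # token 1: "O" narrowly beats "B-PER"
    [0.5, 0.0, 2.0],   # token 2: clearly "I-PER"
    [3.0, 0.0, 0.0],   # token 3: clearly "O"
]
# transitions[prev][cur]: heavily penalize "O" -> "I-PER", which BIO forbids.
transitions = [
    [0.0, 0.0, -10.0],  # from O
    [0.0, 0.0, 1.0],    # from B-PER
    [0.0, 0.0, 1.0],    # from I-PER
]
start = [0.0, 0.0, -10.0]  # a sentence should not start with "I-PER"

greedy = [TAGS[i] for i in argmax_decode(emissions)]
crf_path = [TAGS[i] for i in viterbi_decode(emissions, transitions, start)]
print(greedy)    # ['O', 'I-PER', 'O']
print(crf_path)  # ['B-PER', 'I-PER', 'O']
```

With these numbers, per-token argmax yields the invalid sequence O, I-PER, O, while the transition scores push Viterbi decoding to B-PER, I-PER, O. In a real BERT+CRF model the transition scores are learned jointly with the encoder rather than set by hand.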