References
707 distinct citations across the book (1498 total occurrences).
- (1748). An Enquiry Concerning Human Understanding. London: A. Millar. https://davidhume.org/texts/e/ ↗×1
- (1843). A System of Logic, Ratiocinative and Inductive: Being a Connected View of the Principles of Evidence and the Methods of Scientific Investigation. London: John W. Parker. https://www.gutenberg.org/ebooks/27942 ↗×1
- (1925). Statistical Methods for Research Workers. Oliver & Boyd, Edinburgh. https://psychclassics.yorku.ca/Fisher/Methods/ ↗×1
- (1957). Dynamic Programming. https://press.princeton.edu/books/paperback/9780691146683/dynamic-programming ↗×1
- (1958). The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. https://doi.org/10.1037/h0042519 ↗×1
- (1965). A Machine-Oriented Logic Based on the Resolution Principle. Journal of the ACM, 12(1), 23–41. https://dl.acm.org/doi/10.1145/321250.321253 ↗×1
- (1971). STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2(3–4), 189–208. https://www.sciencedirect.com/science/article/abs/pii/0004370271900105 ↗ DOI: 10.1016/0004-3702(71)90010-5×1
- (1971). On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities. https://epubs.siam.org/doi/10.1137/1116025 ↗×1
- (1972). Un système de communication homme-machine en français. Rapport préliminaire de fin de contrat IRIA, Groupe d'Intelligence Artificielle, Université d'Aix-Marseille II, Luminy. https://softwarepreservation.computerhistory.org/prolog/ ↗×1
- (1972). A combinatorial problem; stability and order for models and theories in infinitary languages. https://shelah.logic.at/papers/16/ ↗×1
- (1975). Problems of Monetary Management: The U.K. Experience. Papers in Monetary Economics, Volume I, Reserve Bank of Australia. https://www.semanticscholar.org/paper/Problems-of-Monetary-Management:-The-UK-Experience-Goodhart/0ae623749b30de53a39cf05813f5f3842e422c01 ↗×1
- (1983). The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika, 70(1), 41–55. https://doi.org/10.1093/biomet/70.1.41 ↗×3
- (1985). A Learning Algorithm for Boltzmann Machines. Cognitive Science, 9(1), 147–169. https://www.cs.toronto.edu/~fritz/absps/cogscibm.pdf ↗ DOI: 10.1207/s15516709cog0901_7×1
- (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533–536. https://www.nature.com/articles/323533a0 ↗ DOI: 10.1038/323533a0×3
- (1988). Root-N-Consistent Semiparametric Regression. Econometrica, 56(4), 931–954. https://www.jstor.org/stable/1912705 ↗ DOI: 10.2307/1912705×1
- (1989). Learnability and the Vapnik-Chervonenkis Dimension. Journal of the ACM, 36(4), 929–965. https://dl.acm.org/doi/10.1145/76359.76371 ↗×1
- (1989). Approximation by superpositions of a sigmoidal function. https://doi.org/10.1007/BF02551274 ↗×1
- (1989). Multilayer feedforward networks are universal approximators. https://doi.org/10.1016/0893-6080(89)90020-8 ↗×1
- (1989). Backpropagation Applied to Handwritten Zip Code Recognition. https://direct.mit.edu/neco/article/1/4/541/5515/Backpropagation-Applied-to-Handwritten-Zip-Code ↗×1
- (1990). Basic Local Alignment Search Tool. Journal of Molecular Biology, 215(3), 403–410. https://pubmed.ncbi.nlm.nih.gov/2231712/ ↗ DOI: 10.1016/S0022-2836(05)80360-2×1
- (1991). An Algorithm for Fast Recovery of Sparse Causal Graphs. Social Science Computer Review, 9(1), 62–72. https://journals.sagepub.com/doi/10.1177/089443939100900106 ↗×1
- (1991). Dyna, an integrated architecture for learning, planning, and reacting. ACM SIGART Bulletin, 2(4), 160–163. https://dl.acm.org/doi/10.1145/122344.122377 ↗×2
- (1992). Practical Issues in Temporal Difference Learning. Machine Learning, 8(3–4), 257–277. https://link.springer.com/article/10.1007/BF00992697 ↗×2
- (1992). Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning. https://doi.org/10.1007/BF00992696 ↗×3
- (1993). Comment: Graphical Models, Causality and Intervention. Statistical Science, 8(3), 266–269. https://projecteuclid.org/euclid.ss/1177010894 ↗ DOI: 10.1214/ss/1177010894×1
- (1994). Okapi at TREC-3. Proceedings of the Third Text REtrieval Conference (TREC-3), NIST Special Publication 500-225, 109–126. https://trec.nist.gov/pubs/trec3/papers/city.ps.gz ↗×1
- (1994). Estimation of Regression Coefficients When Some Regressors Are Not Always Observed. Journal of the American Statistical Association, 89(427), 846–866. https://www.jstor.org/stable/2290910 ↗ DOI: 10.1080/01621459.1994.10476818×1
- (1994). On-line Q-learning using connectionist systems. https://www.semanticscholar.org/paper/On-line-Q-learning-using-connectionist-systems-Rummery-Niranjan/7a09464f26e18a25a948baaa736270bfb84b5e12 ↗×1
- (1995). Causal diagrams for empirical research. Biometrika, 82(4), 669–688. https://academic.oup.com/biomet/article-abstract/82/4/669/251647 ↗ DOI: 10.1093/biomet/82.4.669×5
- (1995). Causal Inference in the Presence of Latent Variables and Selection Bias. Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI 1995), 499–506. https://arxiv.org/abs/1302.4983 ↗×1
- (1996). Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and Automation, 12(4), 566–580. https://ieeexplore.ieee.org/document/508439/ ↗ DOI: 10.1109/70.508439×1
- (1997). Long Short-Term Memory. https://direct.mit.edu/neco/article/9/8/1735/6109/Long-Short-Term-Memory ↗×5
- (1997). A PAC analysis of a Bayesian estimator. https://www.semanticscholar.org/paper/A-PAC-analysis-of-a-Bayesian-estimator-Shawe-Taylor-Williamson/7dfccfdc0b269628730a13c54ff6026b1e9b04a1 ↗×2
- (1997). An Analysis of Temporal-Difference Learning with Function Approximation. https://ieeexplore.ieee.org/document/580874/ ↗×1
- (1998). The Sample Complexity of Pattern Classification with Neural Networks: The Size of the Weights is More Important than the Size of the Network. https://ieeexplore.ieee.org/document/661502/ ↗×3
- (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine. Computer Networks and ISDN Systems, 30(1–7), 107–117. http://infolab.stanford.edu/~backrub/google.html ↗ DOI: 10.1016/S0169-7552(98)00110-X×1
- (1998). Rapidly-Exploring Random Trees: A New Tool for Path Planning. Technical Report No. 98-11, Computer Science Department, Iowa State University. http://msl.cs.illinois.edu/~lavalle/papers/Lav98c.pdf ↗×1
- (1998). Gradient-based learning applied to document recognition. https://ieeexplore.ieee.org/document/726791 ↗×1
- (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. https://projecteuclid.org/journals/annals-of-statistics/volume-26/issue-5/Boosting-the-margin--a-new-explanation-for-the-effectiveness/10.1214/aos/1024691352.full ↗×2
- (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. https://www.nature.com/articles/nn0199_79 ↗×2
- (1999). Policy Gradient Methods for Reinforcement Learning with Function Approximation. https://proceedings.neurips.cc/paper/1999/hash/464d828b85b0bed98e80ade0a5c43b0f-Abstract.html ↗×3
- (2000). Robotic grasping and contact: a review. Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation, 348–353. https://ieeexplore.ieee.org/document/844081/ ↗ DOI: 10.1109/ROBOT.2000.844081×1
- (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press. https://doi.org/10.1017/CBO9780511803161 ↗×1
- (2001). Mechanics of Robotic Manipulation. MIT Press (Intelligent Robotics and Autonomous Agents series). https://mitpress.mit.edu/9780262133968/mechanics-of-robotic-manipulation/ ↗ DOI: 10.7551/mitpress/4527.001.0001×1
- (2002). R-MAX – A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning. Journal of Machine Learning Research, 3, 213–231. https://www.jmlr.org/papers/v3/brafman02a.html ↗ DOI: 10.1162/153244303765208377×1
- (2002). Optimal Structure Identification With Greedy Search. Journal of Machine Learning Research, 3, 507–554. https://jmlr.org/papers/v3/chickering02b.html ↗ DOI: 10.1162/153244303321897717×2
- (2002). Near-Optimal Reinforcement Learning in Polynomial Time. https://link.springer.com/article/10.1023/A:1017984413808 ↗×2
- (2002). Empirical Margin Distributions and Bounding the Generalization Error of Combined Classifiers. https://projecteuclid.org/journals/annals-of-statistics/volume-30/issue-1/Empirical-Margin-Distributions-and-Bounding-the-Generalization--Error-of/10.1214/aos/1015362183.full ↗×2
- (2002). A General Identification Condition for Causal Effects. Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI 2002), 567–573. https://cdn.aaai.org/AAAI/2002/AAAI02-085.pdf ↗×2
- (2005). Probabilistic Robotics. MIT Press (Intelligent Robotics and Autonomous Agents series). https://mitpress.mit.edu/9780262201629/probabilistic-robotics/ ↗×3
- (2006). Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search. Computers and Games: 5th International Conference, CG 2006, Lecture Notes in Computer Science, vol. 4630, pp. 72–83. https://link.springer.com/chapter/10.1007/978-3-540-75538-8_7 ↗×1
- (2006). Calibrating Noise to Sensitivity in Private Data Analysis. https://link.springer.com/chapter/10.1007/11681878_14 ↗×1
- (2006). Reducing the Dimensionality of Data with Neural Networks. https://www.science.org/doi/10.1126/science.1127647 ↗×4
- (2006). A Linear Non-Gaussian Acyclic Model for Causal Discovery. Journal of Machine Learning Research, 7, 2003–2030. https://jmlr.org/papers/v7/shimizu06a.html ↗×1
- (2006). Identification of Joint Interventional Distributions in Recursive Semi-Markovian Causal Models. Proceedings of the 21st National Conference on Artificial Intelligence (AAAI 2006). https://cdn.aaai.org/AAAI/2006/AAAI06-191.pdf ↗×3
- (2008). An analysis of model-based Interval Estimation for Markov Decision Processes. Journal of Computer and System Sciences, 74(8), 1309–1331. https://doi.org/10.1016/j.jcss.2007.08.009 ↗×1
- (2008). Optimal Transport: Old and New. Grundlehren der mathematischen Wissenschaften, Vol. 338, Springer-Verlag, Berlin. https://link.springer.com/book/10.1007/978-3-540-71050-9 ↗×1
- (2008). Extracting and composing robust features with denoising autoencoders. https://dl.acm.org/doi/10.1145/1390156.1390294 ↗×4
- (2009). Nonlinear causal discovery with additive noise models. Advances in Neural Information Processing Systems 21 (NIPS 2008). https://papers.nips.cc/paper/3548-nonlinear-causal-discovery-with-additive-noise-models ↗×1
- (2009). Causality: Models, Reasoning, and Inference. Cambridge University Press (2nd ed.). https://doi.org/10.1017/CBO9780511803161 ↗×1
- (2009). Distilling Free-Form Natural Laws from Experimental Data. Science, 324(5923), 81–85. https://www.science.org/doi/10.1126/science.1165893 ↗×1
- (2010). Understanding the difficulty of training deep feedforward neural networks. https://proceedings.mlr.press/v9/glorot10a.html ↗×1
- (2010). Near-optimal Regret Bounds for Reinforcement Learning. https://jmlr.org/papers/v11/jaksch10a.html ↗×1
- (2010). Recurrent neural network based language model. Interspeech 2010. https://www.isca-archive.org/interspeech_2010/mikolov10_interspeech.html ↗ DOI: 10.21437/Interspeech.2010-343×1
- (2011). Deep Sparse Rectifier Neural Networks. Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS 2011), PMLR 15, 315–323. https://proceedings.mlr.press/v15/glorot11a.html ↗×1
- (2011). Sampling-based Algorithms for Optimal Motion Planning. The International Journal of Robotics Research, 30(7), 846–894. https://arxiv.org/abs/1105.1186 ↗ DOI: 10.1177/0278364911406761×1
- (2011). A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS 2011), PMLR 15, 627–635. https://arxiv.org/abs/1011.0686 ↗×1
- (2012). ImageNet Classification with Deep Convolutional Neural Networks. https://papers.nips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html ↗×3
- (2012). Merck Molecular Activity Challenge. Kaggle Competition. https://www.kaggle.com/c/MerckActivity ↗×1
- (2012). Large Scale Distributed Deep Networks. Advances in Neural Information Processing Systems 25 (NeurIPS 2012). https://papers.nips.cc/paper/2012/hash/6aca97005c68f1206823815f66102863-Abstract.html ↗×1
- (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems 25 (NeurIPS 2012). https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks ↗×2
- (2012). ImageNet Classification with Deep Convolutional Neural Networks. https://papers.nips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html ↗×5
- (2012). On Causal and Anticausal Learning. Proceedings of the 29th International Conference on Machine Learning (ICML 2012). https://arxiv.org/abs/1206.6471 ↗×3
- (2012). Japanese and Korean voice search. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 5149–5152. https://research.google/pubs/japanese-and-korean-voice-search/ ↗ DOI: 10.1109/ICASSP.2012.6289079×1
- (2013). Efficient Estimation of Word Representations in Vector Space. https://arxiv.org/abs/1301.3781 ↗×2
- (2013). Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602. https://arxiv.org/abs/1312.5602 ↗×2
- (2013). Eluder Dimension and the Sample Complexity of Optimistic Exploration. https://proceedings.neurips.cc/paper/2013/hash/41bfd20a38bb1b0bec75acf0845530a7-Abstract.html ↗×1
- (2014). Neural Machine Translation by Jointly Learning to Align and Translate. https://arxiv.org/abs/1409.0473 ↗×1
- (2014). Superintelligence: Paths, Dangers, Strategies. Oxford University Press. https://global.oup.com/academic/product/superintelligence-9780199678112 ↗×4
- (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. https://arxiv.org/abs/1406.1078 ↗×2
- (2014). Generative Adversarial Nets. Advances in Neural Information Processing Systems 27 (NeurIPS 2014). https://arxiv.org/abs/1406.2661 ↗×2
- (2014). Deterministic Policy Gradient Algorithms. Proceedings of the 31st International Conference on Machine Learning (ICML 2014), PMLR 32, 387–395. https://proceedings.mlr.press/v32/silver14.html ↗×2
- (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. https://arxiv.org/abs/1409.1556 ↗×2
- (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. https://jmlr.org/papers/v15/srivastava14a.html ↗×1
- (2014). Visualizing and Understanding Convolutional Networks. Computer Vision – ECCV 2014, Lecture Notes in Computer Science, vol. 8689, pp. 818–833. https://arxiv.org/abs/1311.2901 ↗ DOI: 10.1007/978-3-319-10590-1_53×1
- (2015). Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology, 33(8), 831–838. https://www.nature.com/articles/nbt.3300 ↗ DOI: 10.1038/nbt.3300×1
- (2015). VQA: Visual Question Answering. Proceedings of the IEEE International Conference on Computer Vision (ICCV 2015), 2425–2433. https://arxiv.org/abs/1505.00468 ↗ DOI: 10.1109/ICCV.2015.279×1
- (2015). On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation. PLOS ONE, 10(7), e0130140. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0130140 ↗×1
- (2015). Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015), 1721–1730. https://dl.acm.org/doi/10.1145/2783258.2788613 ↗×3
- (2015). Human-level control through deep reinforcement learning. https://www.nature.com/articles/nature14236 ↗×1
- (2015). Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press. https://www.cambridge.org/core/books/causal-inference-for-statistics-social-and-biomedical-sciences/71126BE90C58F1A431FE9B2DD07938AB ↗ DOI: 10.1017/CBO9781139025751×1
- (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), PMLR 37, 448–456. https://arxiv.org/abs/1502.03167 ↗×1
- (2015). Deep Visual-Semantic Alignments for Generating Image Descriptions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), 3128–3137. https://arxiv.org/abs/1412.2306 ↗ DOI: 10.1109/CVPR.2015.7298932×1
- (2015). Variational Inference with Normalizing Flows. Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), PMLR 37, 1530–1538. https://arxiv.org/abs/1505.05770 ↗×1
- (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, Lecture Notes in Computer Science, vol. 9351, 234–241. https://arxiv.org/abs/1505.04597 ↗ DOI: 10.1007/978-3-319-24574-4_28×1
- (2015). Trust Region Policy Optimization. Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), PMLR 37, 1889–1897. https://arxiv.org/abs/1502.05477 ↗×1
- (2015). Deep Unsupervised Learning using Nonequilibrium Thermodynamics. Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), PMLR 37, 2256–2265. https://arxiv.org/abs/1503.03585 ↗×2
- (2015). Explanation in Causal Inference: Methods for Mediation and Interaction. Oxford University Press. https://global.oup.com/academic/product/explanation-in-causal-inference-9780199325870 ↗×1
- (2015). Show and Tell: A Neural Image Caption Generator. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), 3156–3164. https://arxiv.org/abs/1411.4555 ↗ DOI: 10.1109/CVPR.2015.7298935×1
- (2016). Unifying Count-Based Exploration and Intrinsic Motivation. Advances in Neural Information Processing Systems 29 (NeurIPS 2016). https://arxiv.org/abs/1606.01868 ↗×1
- (2016). Training Deep Nets with Sublinear Memory Cost. arXiv:1604.06174. https://arxiv.org/abs/1604.06174 ↗×1
- (2016). Double Machine Learning for Treatment and Causal Parameters. arXiv:1608.00060. https://arxiv.org/abs/1608.00060 ↗ DOI: 10.48550/arXiv.1608.00060×1
- (2016). Equality of Opportunity in Supervised Learning. Advances in Neural Information Processing Systems 29 (NIPS 2016), 3323–3331. https://arxiv.org/abs/1610.02413 ↗×1
- (2016). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), 770–778. https://arxiv.org/abs/1512.03385 ↗ DOI: 10.1109/CVPR.2016.90×1
- (2016). Causal inference by using invariant prediction: identification and confidence intervals. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(5), 947–1012. https://arxiv.org/abs/1501.01332 ↗ DOI: 10.1111/rssb.12167×1
- (2016). Basset: Learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Research, 26(7), 990–999. https://genome.cshlp.org/content/26/7/990 ↗ DOI: 10.1101/gr.200535.115×1
- (2016). A Diagram Is Worth A Dozen Images. European Conference on Computer Vision (ECCV 2016). https://arxiv.org/abs/1603.07396 ↗ DOI: 10.1007/978-3-319-46493-0_15×1
- (2016). Improving Variational Inference with Inverse Autoregressive Flow. Advances in Neural Information Processing Systems 29 (NeurIPS 2016). https://arxiv.org/abs/1606.04934 ↗×1
- (2016). End-to-End Training of Deep Visuomotor Policies. Journal of Machine Learning Research, 17(39), 1–40. https://arxiv.org/abs/1504.00702 ↗×2
- (2016). Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. 4th International Conference on Learning Representations (ICLR 2016); arXiv:1511.06434. https://arxiv.org/abs/1511.06434 ↗×2
- (2016). "Why Should I Trust You?": Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2016), 1135–1144. https://arxiv.org/abs/1602.04938 ↗ DOI: 10.1145/2939672.2939778×1
- (2016). Controlling Bias in Adaptive Data Analysis Using Information Theory. https://proceedings.mlr.press/v51/russo16.html ↗×2
- (2016). Improved Techniques for Training GANs. Advances in Neural Information Processing Systems 29 (NeurIPS 2016). https://arxiv.org/abs/1606.03498 ↗×1
- (2016). High-Dimensional Continuous Control Using Generalized Advantage Estimation. https://arxiv.org/abs/1506.02438 ↗×2
- (2016). Neural Machine Translation of Rare Words with Subword Units. https://arxiv.org/abs/1508.07909 ↗×2
- (2016). A Note on the Evaluation of Generative Models. 4th International Conference on Learning Representations (ICLR 2016). https://arxiv.org/abs/1511.01844 ↗×1
- (2016). Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning. Proceedings of the 33rd International Conference on Machine Learning (ICML 2016), PMLR 48, 2139–2148. https://proceedings.mlr.press/v48/thomasa16.html ↗×1
- (2016). Pixel Recurrent Neural Networks. Proceedings of the 33rd International Conference on Machine Learning (ICML 2016), PMLR 48, 1747–1756. https://arxiv.org/abs/1601.06759 ↗×4
- (2016). Dueling Network Architectures for Deep Reinforcement Learning. https://arxiv.org/abs/1511.06581 ↗×1
- (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. https://arxiv.org/abs/1712.01815 ↗×1
- (2017). Wasserstein GAN. Proceedings of the 34th International Conference on Machine Learning (ICML 2017), PMLR 70, 214–223. https://arxiv.org/abs/1701.07875 ↗×2
- (2017). Solving the quantum many-body problem with artificial neural networks. Science, 355(6325), 602–606. https://arxiv.org/abs/1606.02318 ↗ DOI: 10.1126/science.aag2302×1
- (2017). Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments. Big Data, 5(2), 153–163. https://doi.org/10.1089/big.2016.0047 ↗×2
- (2017). Density estimation using Real NVP. 5th International Conference on Learning Representations (ICLR 2017). https://arxiv.org/abs/1605.08803 ↗×1
- (2017). Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data. https://arxiv.org/abs/1703.11008 ↗×5
- (2017). Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning. https://arxiv.org/abs/1702.03118 ↗×1
- (2017). Neural Message Passing for Quantum Chemistry. Proceedings of the 34th International Conference on Machine Learning (ICML 2017). https://arxiv.org/abs/1704.01212 ↗×1
- (2017). Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. https://arxiv.org/abs/1706.02677 ↗×3
- (2017). Improved Training of Wasserstein GANs. Advances in Neural Information Processing Systems 30 (NeurIPS 2017). https://arxiv.org/abs/1704.00028 ↗×1
- (2017). GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. Advances in Neural Information Processing Systems 30 (NeurIPS 2017). https://arxiv.org/abs/1706.08500 ↗×1
- (2017). beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. 5th International Conference on Learning Representations (ICLR 2017). https://openreview.net/forum?id=Sy2fzU9gl ↗×1
- (2017). Contextual Decision Processes with Low Bellman Rank are PAC-Learnable. https://arxiv.org/abs/1610.09512 ↗×1
- (2017). CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017). https://arxiv.org/abs/1612.06890 ↗ DOI: 10.1109/CVPR.2017.215×1
- (2017). On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. https://arxiv.org/abs/1609.04836 ↗×2
- (2017). Inherent Trade-Offs in the Fair Determination of Risk Scores. Proceedings of the 8th Innovations in Theoretical Computer Science Conference (ITCS 2017), 67, 43:1–43:23. https://arxiv.org/abs/1609.05807 ↗ DOI: 10.4230/LIPIcs.ITCS.2017.43×3
- (2017). Counterfactual Fairness. Advances in Neural Information Processing Systems 30 (NeurIPS 2017). https://arxiv.org/abs/1703.06856 ↗×3
- (2017). A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems 30 (NeurIPS 2017). https://arxiv.org/abs/1705.07874 ↗×1
- (2017). Mixed Precision Training. arXiv:1710.03740 (later published at ICLR 2018). https://arxiv.org/abs/1710.03740 ↗×3
- (2017). A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks. International Conference on Learning Representations (ICLR 2018); arXiv:1707.09564. https://arxiv.org/abs/1707.09564 ↗×2
- (2017). Feature Visualization. Distill, 2(11). https://distill.pub/2017/feature-visualization/ ↗ DOI: 10.23915/distill.00007×1
- (2017). Masked Autoregressive Flow for Density Estimation. Advances in Neural Information Processing Systems 30 (NeurIPS 2017). https://arxiv.org/abs/1705.07057 ↗×1
- (2017). Curiosity-driven Exploration by Self-supervised Prediction. Proceedings of the 34th International Conference on Machine Learning (ICML 2017), PMLR 70, 2778–2787. https://arxiv.org/abs/1705.05363 ↗×1
- (2017). PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017). https://arxiv.org/abs/1612.00593 ↗ DOI: 10.1109/CVPR.2017.16×1
- (2017). SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. Advances in Neural Information Processing Systems 30 (NeurIPS 2017). https://arxiv.org/abs/1706.08566 ↗×1
- (2017). Estimating individual treatment effect: generalization bounds and algorithms. Proceedings of the 34th International Conference on Machine Learning (ICML 2017), PMLR 70, 3076–3085. https://arxiv.org/abs/1606.03976 ↗×1
- (2017). Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. https://arxiv.org/abs/1701.06538 ↗×2
- (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815. https://arxiv.org/abs/1712.01815 ↗×1
- (2017). Axiomatic Attribution for Deep Networks. Proceedings of the 34th International Conference on Machine Learning (ICML 2017), PMLR 70, 3319–3328. https://arxiv.org/abs/1703.01365 ↗×3
- (2017). Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World. 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017). https://arxiv.org/abs/1703.06907 ↗ DOI: 10.1109/IROS.2017.8202133×3
- (2017). Neural Discrete Representation Learning. Advances in Neural Information Processing Systems 30 (NeurIPS 2017). https://arxiv.org/abs/1711.00937 ↗×1
- (2017). NVIDIA Tesla V100 GPU Architecture: The World's Most Advanced Data Center GPU. NVIDIA Whitepaper WP-08608-001_v1.1, August 2017. https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf ↗×1
- (2017). Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR. Harvard Journal of Law & Technology, 31(2), 841–887. https://arxiv.org/abs/1711.00399 ↗ DOI: 10.2139/ssrn.3063289×2
- (2017). Estimation and Inference of Heterogeneous Treatment Effects using Random Forests. Journal of the American Statistical Association, 113(523), 1228–1242. https://arxiv.org/abs/1510.04342 ↗ DOI: 10.1080/01621459.2017.1319839×1
- (2017). Information-theoretic analysis of generalization capability of learning algorithms. https://arxiv.org/abs/1705.07809 ↗×2
- (2017). Understanding deep learning requires rethinking generalization. https://arxiv.org/abs/1611.03530 ↗×3
- (2018). Sanity Checks for Saliency Maps. Advances in Neural Information Processing Systems 31 (NeurIPS 2018). https://arxiv.org/abs/1810.03292 ↗×2
- (2018). Stronger Generalization Bounds for Deep Nets via a Compression Approach. https://arxiv.org/abs/1802.05296 ↗×1
- (2018). Horovod: fast and easy distributed deep learning in TensorFlow. arXiv:1802.05799. https://arxiv.org/abs/1802.05799 ↗×1
- (2018). A Note on the Inception Score. arXiv:1801.01973 (ICML 2018 Workshop on Theoretical Foundations and Applications of Deep Generative Models). https://arxiv.org/abs/1801.01973 ↗×1
- (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1–C68. https://onlinelibrary.wiley.com/doi/abs/10.1111/ectj.12097 ↗×2
- (2018). Supervising Strong Learners by Amplifying Weak Experts. arXiv:1810.08575. https://arxiv.org/abs/1810.08575 ↗×3
- (2018). What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), Volume 1: Long Papers, 2126–2136. https://arxiv.org/abs/1805.01070 ↗ DOI: 10.18653/v1/P18-1198×2
- (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. https://arxiv.org/abs/1810.04805 ↗×7
- (2018). The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. https://arxiv.org/abs/1803.03635 ↗×1
- (2018). Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs. https://arxiv.org/abs/1802.10026 ↗×1
- (2018). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. https://arxiv.org/abs/1804.07461 ↗ DOI: 10.18653/v1/W18-5446×1
- (2018). Improving Language Understanding by Generative Pre-Training. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf ↗×1
- (2018). Rainbow: Combining Improvements in Deep Reinforcement Learning. https://arxiv.org/abs/1710.02298 ↗×2
- (2018). Universal Language Model Fine-tuning for Text Classification. https://arxiv.org/abs/1801.06146 ↗×1
- (2018). Neural Tangent Kernel: Convergence and Generalization in Neural Networks. Advances in Neural Information Processing Systems 31 (NeurIPS 2018). https://arxiv.org/abs/1806.07572 ↗×3
- (2018). Junction Tree Variational Autoencoder for Molecular Graph Generation. Proceedings of the 35th International Conference on Machine Learning (ICML 2018), PMLR 80, 2323–2332. https://arxiv.org/abs/1802.04364 ↗×1
- (2018). Progressive Growing of GANs for Improved Quality, Stability, and Variation. International Conference on Learning Representations (ICLR 2018). https://arxiv.org/abs/1710.10196 ↗×2
- (2018). Glow: Generative Flow with Invertible 1x1 Convolutions. Advances in Neural Information Processing Systems 31 (NeurIPS 2018). https://arxiv.org/abs/1807.03039 ↗×1
- (2018). Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates. https://arxiv.org/abs/1804.10959 ↗×4
- (2018). Learning Plannable Representations with Causal InfoGAN. Advances in Neural Information Processing Systems 31 (NeurIPS 2018). https://arxiv.org/abs/1807.09341 ↗×1
- (2018). Scalable agent alignment via reward modeling: a research direction. arXiv:1811.07871. https://arxiv.org/abs/1811.07871 ↗ DOI: 10.48550/arXiv.1811.07871×2
- (2018). Categorizing Variants of Goodhart's Law. arXiv:1803.04585. https://arxiv.org/abs/1803.04585 ↗×2
- (2018). The Building Blocks of Interpretability. Distill, 3(3), e10. https://distill.pub/2018/building-blocks/ ↗ DOI: 10.23915/distill.00010×1
- (2018). Representation Learning with Contrastive Predictive Coding. https://arxiv.org/abs/1807.03748 ↗×4
- ×1
- (2018). Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution. https://arxiv.org/abs/1801.04016 ↗×2
- ×2
- (2018). Improving Language Understanding by Generative Pre-Training. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf ↗×6
- (2018). search Scholar ↗×1
- (2018). search Scholar ↗×2
- ×2
- (2018). search Scholar ↗×2
- (2018). Estimation and Inference of Heterogeneous Treatment Effects using Random Forests. Journal of the American Statistical Association, 113(523), 1228–1242. https://arxiv.org/abs/1510.04342 ↗ DOI: 10.1080/01621459.2017.1319839×2
- (2018). MoleculeNet: A Benchmark for Molecular Machine Learning. Chemical Science, 9(2), 513–530. https://arxiv.org/abs/1703.00564 ↗ DOI: 10.1039/C7SC02664A×1
- (2018b). Soft Actor-Critic Algorithms and Applications. arXiv:1812.05905. https://arxiv.org/abs/1812.05905 ↗×1
- ×3
- ×1
- (2019). Reconciling modern machine-learning practice and the classical bias–variance trade-off. https://www.pnas.org/doi/10.1073/pnas.1903070116 ↗×3
- (2019). Pros and Cons of GAN Evaluation Measures. Computer Vision and Image Understanding, 179, 41–65. https://arxiv.org/abs/1802.03446 ↗ DOI: 10.1016/j.cviu.2018.10.009×1
- (2019). Large Scale GAN Training for High Fidelity Natural Image Synthesis. International Conference on Learning Representations (ICLR 2019). https://arxiv.org/abs/1809.11096 ↗×2
- ×3
- (2019). GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism. Advances in Neural Information Processing Systems 32 (NeurIPS 2019). https://arxiv.org/abs/1811.06965 ↗×2
- (2019). Dream to Control: Learning Behaviors by Latent Imagination. arXiv:1912.01603 (later ICLR 2020). https://arxiv.org/abs/1912.01603 ↗×1
- (2019). Designing and Interpreting Probes with Control Tasks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019). https://arxiv.org/abs/1909.03368 ↗ DOI: 10.18653/v1/D19-1275×1
- ×3
- (2019). GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism. Advances in Neural Information Processing Systems 32 (NeurIPS 2019). https://arxiv.org/abs/1811.06965 ↗×1
- (2019). Risks from Learned Optimization in Advanced Machine Learning Systems. arXiv:1906.01820. https://arxiv.org/abs/1906.01820 ↗×1
- (2019). GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019). https://arxiv.org/abs/1902.09506 ↗×1
- (2019). Nonlinear ICA Using Auxiliary Variables and Generalized Contrastive Learning. Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019), PMLR 89, 859–868. https://arxiv.org/abs/1805.08651 ↗×2
- (2019). Attention is not Explanation. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019). https://arxiv.org/abs/1902.10186 ↗ DOI: 10.18653/v1/N19-1357×2
- (2019). A Style-Based Generator Architecture for Generative Adversarial Networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 4401–4410. https://arxiv.org/abs/1812.04948 ↗ DOI: 10.1109/CVPR.2019.00453×2
- (2019). Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116(10), 4156–4165. https://www.pnas.org/doi/10.1073/pnas.1804597116 ↗×1
- (2019). search Scholar ↗×1
- (2019). search Scholar ↗×1
- (2019). search Scholar ↗×3
- (2019). ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks. Advances in Neural Information Processing Systems 32 (NeurIPS 2019). https://arxiv.org/abs/1908.02265 ↗×1
- (2019). Rebooting AI: Building Artificial Intelligence We Can Trust. https://www.penguinrandomhouse.com/books/603982/rebooting-ai-by-gary-marcus-and-ernest-davis/ ↗×1
- (2019). Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. arXiv:1909.08053. https://arxiv.org/abs/1909.08053 ↗×1
- (2019). The generalization error of random features regression: Precise asymptotics and double descent curve. https://arxiv.org/abs/1908.05355 ↗×1
- (2019). Uniform convergence may be unable to explain generalization in deep learning. https://arxiv.org/abs/1902.04742 ↗×6
- (2019). Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates. https://arxiv.org/abs/1911.02151 ↗×2
- (2019). Computational Optimal Transport. Foundations and Trends in Machine Learning, 11(5–6), 355–607. https://arxiv.org/abs/1803.00567 ↗ DOI: 10.1561/2200000073×1
- (2019). Language Models are Unsupervised Multitask Learners. https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf ↗×5
- (2019). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. https://arxiv.org/abs/1910.10683 ↗×4
- (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707. https://www.sciencedirect.com/science/article/abs/pii/S0021999118307125 ↗ DOI: 10.1016/j.jcp.2018.10.045×1
- (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019). https://arxiv.org/abs/1908.10084 ↗ DOI: 10.18653/v1/D19-1410×1
- (2019). A Theoretical Analysis of Contrastive Unsupervised Representation Learning. https://arxiv.org/abs/1902.09229 ↗×1
- (2019). Learning Retrosynthetic Planning through Simulated Experience. ACS Central Science, 5(6), 970–981. https://pubs.acs.org/doi/10.1021/acscentsci.9b00055 ↗×1
- (2019). Adapting Neural Networks for the Estimation of Treatment Effects. Advances in Neural Information Processing Systems 32 (NeurIPS 2019). https://arxiv.org/abs/1906.02120 ↗×1
- (2019). Generative Modeling by Estimating Gradients of the Data Distribution. Advances in Neural Information Processing Systems 32 (NeurIPS 2019). https://arxiv.org/abs/1907.05600 ↗×2
- (2019). SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems. Advances in Neural Information Processing Systems 32 (NeurIPS 2019). https://arxiv.org/abs/1905.00537 ↗×1
- (2019). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. https://arxiv.org/abs/1905.11946 ↗×2
- ×5
- (2020). Quantifying Attention Flow in Transformers. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), 4190–4197. https://aclanthology.org/2020.acl-main.385/ ↗ DOI: 10.18653/v1/2020.acl-main.385×2
- (2020). NVIDIA A100 Tensor Core GPU Architecture: Unprecedented Acceleration at Every Scale. NVIDIA Whitepaper, v1.0. https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/nvidia-ampere-architecture-whitepaper.pdf ↗×1
- ×2
- (2020). Two Models of Double Descent for Weak Features. https://epubs.siam.org/doi/10.1137/20M1336072 ↗×1
- ×1
- ×7
- (2020). Tightening Mutual Information Based Bounds on Generalization Error. https://arxiv.org/abs/1901.04609 ↗×2
- (2020). A mobile robotic chemist. Nature, 583(7815), 237–241. https://www.nature.com/articles/s41586-020-2442-2 ↗ DOI: 10.1038/s41586-020-2442-2×1
- (2020). Unsupervised Learning of Visual Features by Contrasting Cluster Assignments. https://arxiv.org/abs/2006.09882 ↗×3
- (2020). A Simple Framework for Contrastive Learning of Visual Representations. https://arxiv.org/abs/2002.05709 ↗×4
- (2020). ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction. arXiv:2010.09885. https://arxiv.org/abs/2010.09885 ↗×1
- (2020). Lagrangian Neural Networks. arXiv:2003.04630 (ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations). https://arxiv.org/abs/2003.04630 ↗×1
- (2020). ZeRO: Memory Optimizations Toward Training Trillion Parameter Models. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '20). https://arxiv.org/abs/1910.02054 ↗ DOI: 10.1109/SC41405.2020.00024×1
- (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. https://arxiv.org/abs/2010.11929 ↗×4
- (2020). Benchmarking Materials Property Prediction Methods: The Matbench Test Set and Automatminer Reference Algorithm. npj Computational Materials, 6, 138. https://arxiv.org/abs/2005.00707 ↗ DOI: 10.1038/s41524-020-00406-3×1
- (2020). GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), 11444–11453. https://openaccess.thecvf.com/content_CVPR_2020/html/Fang_GraspNet-1Billion_A_Large-Scale_Benchmark_for_General_Object_Grasping_CVPR_2020_paper.html ↗ DOI: 10.1109/CVPR42600.2020.01146×1
- (2020). search Scholar ↗×1
- (2020). The Pile: An 800GB Dataset of Diverse Text for Language Modeling. https://arxiv.org/abs/2101.00027 ↗×2
- (2020). search Scholar ↗×2
- (2020). search Scholar ↗×1
- (2020). Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. https://arxiv.org/abs/2006.07733 ↗×3
- (2020). HiPPO: Recurrent Memory with Optimal Polynomial Projections. https://arxiv.org/abs/2008.07669 ↗×2
- (2020). search Scholar ↗×2
- (2020). Momentum Contrast for Unsupervised Visual Representation Learning. https://arxiv.org/abs/1911.05722 ↗×3
- (2020). Measuring Massive Multitask Language Understanding. International Conference on Learning Representations (ICLR 2021). https://arxiv.org/abs/2009.03300 ↗×1
- (2020). Deep-neural-network solution of the electronic Schrödinger equation. Nature Chemistry, 12(10), 891–897. https://arxiv.org/abs/1909.08423 ↗ DOI: 10.1038/s41557-020-0544-y×1
- (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC. https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/ ↗×2
- (2020). Denoising Diffusion Probabilistic Models. Advances in Neural Information Processing Systems 33 (NeurIPS 2020). https://arxiv.org/abs/2006.11239 ↗×1
- ×2
- (2020). Provably Efficient Reinforcement Learning with Linear Function Approximation. https://arxiv.org/abs/1907.05388 ↗×1
- ×13
- (2020). Dense Passage Retrieval for Open-Domain Question Answering. https://arxiv.org/abs/2004.04906 ↗ DOI: 10.18653/v1/2020.emnlp-main.550×1
- (2020). ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2020). https://arxiv.org/abs/2004.12832 ↗ DOI: 10.1145/3397271.3401075×2
- (2020). Variational Autoencoders and Nonlinear ICA: A Unifying Framework. Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020), PMLR 108. https://arxiv.org/abs/1907.04809 ↗×1
- (2020). Specification gaming: the flip side of AI ingenuity. DeepMind Blog, 21 April 2020. https://deepmind.google/blog/specification-gaming-the-flip-side-of-ai-ingenuity/ ↗×1
- (2020). Conservative Q-Learning for Offline Reinforcement Learning. Advances in Neural Information Processing Systems 33 (NeurIPS 2020). https://arxiv.org/abs/2006.04779 ↗×2
- (2020). Learning Quadrupedal Locomotion over Challenging Terrain. Science Robotics, 5(47), eabc5986. https://arxiv.org/abs/2010.11251 ↗ DOI: 10.1126/scirobotics.abc5986×4
- (2020). GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding. arXiv:2006.16668. https://arxiv.org/abs/2006.16668 ↗×1
- (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. Advances in Neural Information Processing Systems 33 (NeurIPS 2020). https://arxiv.org/abs/2005.11401 ↗×1
- (2020). The Open Catalyst 2020 (OC20) Dataset and Community Challenges. ACS Catalysis, 11(10), 6059–6072 (arXiv:2010.09990, 2020). https://arxiv.org/abs/2010.09990 ↗ DOI: 10.1021/acscatal.0c04525×1
- (2020). NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. European Conference on Computer Vision (ECCV 2020). https://arxiv.org/abs/2003.08934 ↗ DOI: 10.1007/978-3-030-58452-8_24×2
- (2020). Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model. https://arxiv.org/abs/1911.08265 ↗×1
- (2020). AWAC: Accelerating Online Reinforcement Learning with Offline Datasets. arXiv:2006.09359. https://arxiv.org/abs/2006.09359 ↗×1
- (2020). Deep Double Descent: Where Bigger Models and More Data Hurt. https://arxiv.org/abs/1912.02292 ↗×2
- (2020). Zoom In: An Introduction to Circuits. Distill, 5(3), e00024.001. https://distill.pub/2020/circuits/zoom-in/ ↗ DOI: 10.23915/distill.00024.001×1
- (2020). Ab initio solution of the many-electron Schrödinger equation with deep neural networks. Physical Review Research, 2(3), 033429. https://arxiv.org/abs/1909.02487 ↗ DOI: 10.1103/PhysRevResearch.2.033429×1
- (2020). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research, 21(140), 1–67. https://arxiv.org/abs/1910.10683 ↗×1
- (2020). WeatherBench: A Benchmark Data Set for Data-Driven Weather Forecasting. Journal of Advances in Modeling Earth Systems, 12(11), e2020MS002203. https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2020MS002203 ↗×1
- (2020). Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model. https://arxiv.org/abs/1911.08265 ↗×2
- (2020). Improved protein structure prediction using potentials from deep learning. Nature, 577(7792), 706–710. https://www.nature.com/articles/s41586-019-1923-7 ↗ DOI: 10.1038/s41586-019-1923-7×2
- ×5
- (2020). Improved Techniques for Training Score-Based Generative Models. Advances in Neural Information Processing Systems 33 (NeurIPS 2020). https://arxiv.org/abs/2006.09011 ↗×2
- (2020). Reasoning About Generalization via Conditional Mutual Information. https://arxiv.org/abs/2001.09122 ↗×2
- ×4
- (2020). On Mutual Information Maximization for Representation Learning. https://arxiv.org/abs/1907.13625 ↗×5
- (2020). AI Feynman: A physics-inspired method for symbolic regression. Science Advances, 6(16), eaay2631. https://arxiv.org/abs/1905.11481 ↗ DOI: 10.1126/sciadv.aay2631×1
- (2020). Investigating Gender Bias in Language Models Using Causal Mediation Analysis. Advances in Neural Information Processing Systems 33 (NeurIPS 2020). https://arxiv.org/abs/2004.12265 ↗×1
- ×1
- (2020). Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound. https://arxiv.org/abs/1905.10389 ↗×1
- (2020). search Scholar ↗×1
- (2021). VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text. https://arxiv.org/abs/2104.11178 ↗×1
- (2021). search Scholar ↗×1
- (2021). search Scholar ↗×1
- (2021). search Scholar ↗×1
- (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. https://dl.acm.org/doi/10.1145/3442188.3445922 ↗×5
- ×5
- (2021). Choose a Transformer: Fourier or Galerkin. Advances in Neural Information Processing Systems 34 (NeurIPS 2021). https://arxiv.org/abs/2105.14995 ↗×1
- (2021). Emerging Properties in Self-Supervised Vision Transformers. https://arxiv.org/abs/2104.14294 ↗×4
- (2021). Open Catalyst 2020 (OC20) Dataset and Community Challenges. ACS Catalysis, 11(10), 6059–6072. https://arxiv.org/abs/2010.09990 ↗ DOI: 10.1021/acscatal.0c04525×1
- (2021). Evaluating Large Language Models Trained on Code. arXiv:2107.03374. https://arxiv.org/abs/2107.03374 ↗×2
- (2021). Learning Transferable Visual Models From Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning (ICML 2021). https://arxiv.org/abs/2103.00020 ↗×1
- (2021). Training Verifiers to Solve Math Word Problems. arXiv:2110.14168. https://arxiv.org/abs/2110.14168 ↗×4
- (2021). Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability. https://arxiv.org/abs/2103.00065 ↗×1
- (2021). The case for aligning narrowly superhuman models. AI Alignment Forum (blog post). https://www.alignmentforum.org/posts/PZtsoaoSLpKjjbMqM/the-case-for-aligning-narrowly-superhuman-models ↗×2
- (2021). search Scholar ↗×2
- (2021). search Scholar ↗×1
- (2021). Bilinear Classes: A Structural Framework for Provable Generalization in RL. https://arxiv.org/abs/2103.10897 ↗×1
- (2021). A Mathematical Framework for Transformer Circuits. https://transformer-circuits.pub/2021/framework/index.html ↗×4
- (2021). search Scholar ↗×2
- (2021). search Scholar ↗×1
- (2021). Transformer Feed-Forward Layers Are Key-Value Memories. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 5484–5495. https://arxiv.org/abs/2012.14913 ↗ DOI: 10.18653/v1/2021.emnlp-main.446×1
- (2021). Multimodal Neurons in Artificial Neural Networks. Distill, 6(3). https://distill.pub/2021/multimodal-neurons/ ↗ DOI: 10.23915/distill.00030×3
- (2021). Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision. Proceedings of the 38th International Conference on Machine Learning (ICML 2021); arXiv:2102.05918. https://arxiv.org/abs/2102.05918 ↗×2
- (2021). Efficiently Modeling Long Sequences with Structured State Spaces. https://arxiv.org/abs/2111.00396 ↗×2
- (2021). Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss. https://arxiv.org/abs/2106.04156 ↗×2
- (2021). Measuring Mathematical Problem Solving With the MATH Dataset. Advances in Neural Information Processing Systems 34, Datasets and Benchmarks Track (NeurIPS 2021). https://arxiv.org/abs/2103.03874 ↗×7
- ×1
- (2021). Classifier-Free Diffusion Guidance. NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications (later arXiv:2207.12598). https://arxiv.org/abs/2207.12598 ↗ DOI: 10.48550/arXiv.2207.12598×2
- ×3
- (2021). search Scholar ↗×1
- (2021). Unsupervised Dense Information Retrieval with Contrastive Learning. arXiv:2112.09118. https://arxiv.org/abs/2112.09118 ↗×2
- (2021). Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision. https://arxiv.org/abs/2102.05918 ↗×2
- (2021). Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms. https://arxiv.org/abs/2102.00815 ↗×3
- (2021). Highly accurate protein structure prediction with AlphaFold. https://doi.org/10.1038/s41586-021-03819-2 ↗×8
- (2021). Algorithmic Recourse: from Counterfactual Explanations to Interventions. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21). https://dl.acm.org/doi/10.1145/3442188.3445899 ↗×2
- (2021). A Distributional Approach to Controlled Text Generation. International Conference on Learning Representations (ICLR 2021). https://arxiv.org/abs/2012.11635 ↗×1
- (2021). The Power of Scale for Parameter-Efficient Prompt Tuning. https://arxiv.org/abs/2104.08691 ↗×1
- (2021). Prefix-Tuning: Optimizing Continuous Prompts for Generation. https://arxiv.org/abs/2101.00190 ↗×3
- (2021). P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks. arXiv:2110.07602. https://arxiv.org/abs/2110.07602 ↗×1
- (2021). Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3), 218–229. https://www.nature.com/articles/s42256-021-00302-5 ↗ DOI: 10.1038/s42256-021-00302-5×1
- (2021). DocVQA: A Dataset for VQA on Document Images. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2021, 2200–2209. https://arxiv.org/abs/2007.00398 ↗ DOI: 10.1109/WACV48630.2021.00225×1
- (2021). Measuring Massive Multitask Language Understanding. International Conference on Learning Representations (ICLR 2021). https://arxiv.org/abs/2009.03300 ↗×1
- (2021). WebGPT: Browser-assisted question-answering with human feedback. arXiv:2112.09332. https://arxiv.org/abs/2112.09332 ↗×1
- (2021). Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '21). https://arxiv.org/abs/2104.04473 ↗ DOI: 10.1145/3458817.3476209×1
- (2021). GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models. arXiv:2112.10741. https://arxiv.org/abs/2112.10741 ↗×1
- (2021). Learning Transferable Visual Models From Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning (ICML 2021), PMLR 139, 8748–8763. https://arxiv.org/abs/2103.00020 ↗×2
- (2021). Carbon Emissions and Large Neural Network Training. arXiv:2104.10350. https://arxiv.org/abs/2104.10350 ↗×1
- (2021). Tighter Risk Certificates for Neural Networks. Journal of Machine Learning Research, 22(227), 1–40. https://www.jmlr.org/papers/v22/20-879.html ↗×1
- (2021). Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation. https://arxiv.org/abs/2108.12409 ↗×3
- (2021). Learning Transferable Visual Models From Natural Language Supervision. https://arxiv.org/abs/2103.00020 ↗×11
- (2021). ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '21). https://arxiv.org/abs/2104.07857 ↗ DOI: 10.1145/3458817.3476205×1
- (2021). Zero-Shot Text-to-Image Generation. Proceedings of the 38th International Conference on Machine Learning (ICML 2021), PMLR 139, 8821–8831. https://arxiv.org/abs/2102.12092 ↗×1
- (2021). The Risks of Invariant Risk Minimization. International Conference on Learning Representations (ICLR 2021). https://arxiv.org/abs/2010.05761 ↗×2
- (2021). Toward Causal Representation Learning. Proceedings of the IEEE, 109(5), 612–634. https://ieeexplore.ieee.org/document/9363924/ ↗ DOI: 10.1109/JPROC.2021.3058954×2
- (2021). Denoising Diffusion Implicit Models. 9th International Conference on Learning Representations (ICLR 2021). https://arxiv.org/abs/2010.02502 ↗×4
- (2021). RoFormer: Enhanced Transformer with Rotary Position Embedding. https://arxiv.org/abs/2104.09864 ↗×5
- (2021). Understanding self-supervised Learning Dynamics without Contrastive Pairs. Proceedings of the 38th International Conference on Machine Learning (ICML 2021), PMLR 139, 10268–10278. https://arxiv.org/abs/2102.06810 ↗×1
- (2021). search Scholar ↗×1
- ×1
- ×2
- (2021). search Scholar ↗×1
- (2021). search Scholar ↗×1
- (2022). What learning algorithm is in-context learning? Investigations with linear models. https://arxiv.org/abs/2211.15661 ↗×1
- ×3
- (2022). search Scholar ↗×12
- (2022). search Scholar ↗×1
- ×5
- (2022). MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields. Advances in Neural Information Processing Systems 35 (NeurIPS 2022). https://arxiv.org/abs/2206.07697 ↗×2
- (2022). E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nature Communications, 13(1), 2453. https://www.nature.com/articles/s41467-022-29939-5 ↗ DOI: 10.1038/s41467-022-29939-5×2
- (2022). Better speech synthesis through scaling. arXiv:2305.07243. https://arxiv.org/abs/2305.07243 ↗×1
- (2022). Discovering Latent Knowledge in Language Models Without Supervision. arXiv:2212.03827. https://arxiv.org/abs/2212.03827 ↗×6
- (2022). Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks. https://arxiv.org/abs/2211.12588 ↗×3
- (2022). Training Compute-Optimal Large Language Models. Advances in Neural Information Processing Systems 35 (NeurIPS 2022). https://arxiv.org/abs/2203.15556 ↗×2
- (2022). Canine: Pre-training an Efficient Tokenization-Free Encoder for Language Representation. https://aclanthology.org/2022.tacl-1.5/ ↗×1
- (2022). search Scholar ↗×1
- (2022). FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. https://arxiv.org/abs/2205.14135 ↗×4
- (2022). search Scholar ↗×1
- (2022). search Scholar ↗×1
- (2022). search Scholar ↗×1
- (2022). search Scholar ↗×2
- (2022). Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. https://arxiv.org/abs/2101.03961 ↗×3
- (2022). Flamingo: a Visual Language Model for Few-Shot Learning. Advances in Neural Information Processing Systems 35 (NeurIPS 2022). https://arxiv.org/abs/2204.14198 ↗×1
- (2022). GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers. arXiv:2210.17323. https://arxiv.org/abs/2210.17323 ↗×2
- (2022). CRASS: A Novel Data Set and Benchmark to Test Counterfactual Reasoning of Large Language Models. Proceedings of the Thirteenth Language Resources and Evaluation Conference (LREC 2022), 2126–2140. https://aclanthology.org/2022.lrec-1.229/ ↗×1
- (2022). Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned. arXiv:2209.07858. https://arxiv.org/abs/2209.07858 ↗×1
- (2022). What Can Transformers Learn In-Context? A Case Study of Simple Function Classes. https://arxiv.org/abs/2208.01066 ↗×1
- (2022). search Scholar ↗×2
- (2022). search Scholar ↗×3
- (2022). search Scholar ↗×2
- ×3
- (2022). search Scholar ↗×1
- ×12
- (2022). search Scholar ↗×1
- (2022). search Scholar ↗×1
- (2022). OpenCLIP. Zenodo (Software). https://zenodo.org/records/6496083 ↗ DOI: 10.5281/zenodo.5143773×1
- (2022). Chemformer: a pre-trained transformer for computational chemistry. Machine Learning: Science and Technology, 3(1), 015022. https://iopscience.iop.org/article/10.1088/2632-2153/ac3ffb ↗×1
- ×4
- (2022). Offline Reinforcement Learning with Implicit Q-Learning. International Conference on Learning Representations (ICLR 2022). https://arxiv.org/abs/2110.06169 ↗×2
- (2022). Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation. arXiv:2211.06687. https://arxiv.org/abs/2211.06687 ↗×1
- (2022). A Path Towards Autonomous Machine Intelligence. https://openreview.net/forum?id=BZ5a1r-kVsf ↗×2
- (2022). Deduplicating Training Data Makes Language Models Better. https://arxiv.org/abs/2107.06499 ↗×4
- (2022). search Scholar ↗×1
- (2022). BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation. https://arxiv.org/abs/2201.12086 ↗×3
- ×3
- (2022). search Scholar ↗×4
- (2022). search Scholar ↗×2
- (2022). search Scholar ↗×1
- ×7
- (2022). PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization. https://arxiv.org/abs/2211.13609 ↗×2
- (2022). Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity. https://arxiv.org/abs/2104.08786 ↗×3
- (2022). search Scholar ↗×1
- (2022). search Scholar ↗×1
- (2022). CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks. IEEE Robotics and Automation Letters, 7(3), 7327–7334. https://arxiv.org/abs/2112.03227 ↗ DOI: 10.1109/LRA.2022.3180108×1
- (2022). Locating and Editing Factual Associations in GPT. Advances in Neural Information Processing Systems 35 (NeurIPS 2022). https://arxiv.org/abs/2202.05262 ↗×4
- (2022). OPT: Open Pre-trained Transformer Language Models. arXiv:2205.01068. https://arxiv.org/abs/2205.01068 ↗ DOI: 10.48550/arXiv.2205.01068×1
- (2022). Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? https://arxiv.org/abs/2202.12837 ↗×2
- (2022). Walk These Ways: Tuning Robot Control for Generalization with Multiplicity of Behavior. Proceedings of the 6th Conference on Robot Learning (CoRL 2022). https://arxiv.org/abs/2212.03238 ↗×2
- (2022). search Scholar ↗×1
- (2022). search Scholar ↗×2
- (2022). search Scholar ↗×1
- (2022). search Scholar ↗×2
- (2022). In-context Learning and Induction Heads. https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html ↗×6
- (2022). search Scholar ↗×2
- (2022). Training language models to follow instructions with human feedback. https://arxiv.org/abs/2203.02155 ↗×8
- (2022). search Scholar ↗×1
- (2022). FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators. arXiv:2202.11214. https://arxiv.org/abs/2202.11214 ↗×2
- (2022). The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink. Computer, 55(7), 18–28. https://arxiv.org/abs/2204.05149 ↗ DOI: 10.1109/MC.2022.3148714×1
- (2022). Red Teaming Language Models with Language Models. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), 3419–3448. https://arxiv.org/abs/2202.03286 ↗ DOI: 10.18653/v1/2022.emnlp-main.225×5
- (2022). DreamFusion: Text-to-3D using 2D Diffusion. arXiv:2209.14988. https://arxiv.org/abs/2209.14988 ↗×1
- (2022). Robust Speech Recognition via Large-Scale Weak Supervision. https://arxiv.org/abs/2212.04356 ↗×4
- (2022). Hierarchical Text-Conditional Image Generation with CLIP Latents. arXiv:2204.06125. https://arxiv.org/abs/2204.06125 ↗ DOI: 10.48550/arXiv.2204.06125×1
- (2022). ReAct: Synergizing Reasoning and Acting in Language Models. arXiv:2210.03629 (later published at ICLR 2023). https://arxiv.org/abs/2210.03629 ↗×1
- ×2
- (2022). search Scholar ↗×3
- (2022). search Scholar ↗×1
- (2022). search Scholar ↗×1
- (2022). search Scholar ↗×1
- (2022). Understanding Contrastive Learning Requires Incorporating Inductive Biases. https://arxiv.org/abs/2202.14037 ↗×2
- (2022). Beyond the Imitation Game: Quantifying and Extrapolating the Capabilities of Language Models. arXiv:2206.04615. https://arxiv.org/abs/2206.04615 ↗ DOI: 10.48550/arXiv.2206.04615×2
- (2022). Extracting Latent Steering Vectors from Pretrained Language Models. Findings of the Association for Computational Linguistics: ACL 2022. https://aclanthology.org/2022.findings-acl.48/ ↗ DOI: 10.18653/v1/2022.findings-acl.48×3
- (2022). Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them. arXiv:2210.09261. https://arxiv.org/abs/2210.09261 ↗ DOI: 10.48550/arXiv.2210.09261×1
- (2022). Solving math word problems with process- and outcome-based feedback. arXiv:2211.14275. https://arxiv.org/abs/2211.14275 ↗×1
- (2022). Self-Consistency Improves Chain of Thought Reasoning in Language Models. https://arxiv.org/abs/2203.11171 ↗×14
- ×10
- (2022). DayDreamer: World Models for Physical Robot Learning. arXiv:2206.14176 (also Proceedings of the 6th Conference on Robot Learning, PMLR 205, 2023). https://arxiv.org/abs/2206.14176 ↗×1
- (2022). search Scholar ↗×1
- (2022). An Explanation of In-context Learning as Implicit Bayesian Inference. https://arxiv.org/abs/2111.02080 ↗×1
- (2022). ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models. https://arxiv.org/abs/2105.13626 ↗×1
- (2022). search Scholar ↗×4
- (2022). search Scholar ↗×1
- (2022). search Scholar ↗×1
- ×3
- (2023). GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints. https://arxiv.org/abs/2305.13245 ↗×4
- (2023). search Scholar ↗×3
- (2023). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), Datasets and Benchmarks Track. https://arxiv.org/abs/2306.05685 ↗×1
- (2023). Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection. arXiv:2310.11511 (later published at ICLR 2024). https://arxiv.org/abs/2310.11511 ↗×1
- (2023). Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture. https://arxiv.org/abs/2301.08243 ↗×2
- (2023). A General Theoretical Paradigm to Understand Learning from Human Preferences. https://arxiv.org/abs/2310.12036 ↗×1
- (2023). Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond. arXiv:2308.12966. https://arxiv.org/abs/2308.12966 ↗×1
- (2023). An autonomous laboratory for the accelerated synthesis of novel materials. Nature, 624(7990), 86–91. https://www.nature.com/articles/s41586-023-06734-w ↗ DOI: 10.1038/s41586-023-06734-w×1
- (2023). search Scholar ↗×2
- (2023). Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling. https://arxiv.org/abs/2304.01373 ↗×1
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×2
- (2023). search Scholar ↗×2
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×1
- (2023). InstructPix2Pix: Learning to Follow Image Editing Instructions. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023). https://arxiv.org/abs/2211.09800 ↗ DOI: 10.1109/CVPR52729.2023.01764×1
- (2023). Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision. arXiv:2312.09390. https://arxiv.org/abs/2312.09390 ↗×1
- ×4
- (2023). Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. Transactions on Machine Learning Research (TMLR), 2023; arXiv:2307.15217. https://arxiv.org/abs/2307.15217 ↗×1
- ×3
- (2023). Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science, 381(6664), eadg7492. https://www.science.org/doi/10.1126/science.adg7492 ↗×2
- (2023). Diffusion Policy: Visuomotor Policy Learning via Action Diffusion. Proceedings of Robotics: Science and Systems (RSS 2023). https://arxiv.org/abs/2303.04137 ↗ DOI: 10.15607/RSS.2023.XIX.026×2
- (2023). Open X-Embodiment: Robotic Learning Datasets and RT-X Models. 2024 IEEE International Conference on Robotics and Automation (ICRA 2024); arXiv:2310.08864. https://arxiv.org/abs/2310.08864 ↗ DOI: 10.48550/arXiv.2310.08864×2
- (2023). XTTS-v2 (model release). Hugging Face model repository (coqui/XTTS-v2), released November 2023. https://huggingface.co/coqui/XTTS-v2 ↗×1
- (2023). Interpretable Machine Learning for Science with PySR and SymbolicRegression.jl. arXiv:2305.01582. https://arxiv.org/abs/2305.01582 ↗×1
- (2023). scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI. bioRxiv 2023.04.30.538439 (later published in Nature Methods, 21(8), 1470–1480, 2024). https://www.biorxiv.org/content/10.1101/2023.04.30.538439v2 ↗×1
- (2023). Sparse Autoencoders Find Highly Interpretable Features in Language Models. arXiv:2309.08600. https://arxiv.org/abs/2309.08600 ↗×2
- (2023). FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning. https://arxiv.org/abs/2307.08691 ↗×4
- (2023). Gemini: A Family of Highly Capable Multimodal Models. arXiv:2312.11805. https://arxiv.org/abs/2312.11805 ↗×2
- (2023). QLoRA: Efficient Finetuning of Quantized LLMs. Advances in Neural Information Processing Systems 36 (NeurIPS 2023). https://arxiv.org/abs/2305.14314 ↗×2
- (2023). Chain-of-Verification Reduces Hallucination in Large Language Models. arXiv:2309.11495. https://arxiv.org/abs/2309.11495 ↗×2
- (2023). PaLM-E: An Embodied Multimodal Language Model. Proceedings of the 40th International Conference on Machine Learning (ICML 2023), PMLR 202. https://arxiv.org/abs/2303.03378 ↗×1
- (2023). Improving Factuality and Reasoning in Language Models through Multiagent Debate. arXiv:2305.14325. https://arxiv.org/abs/2305.14325 ↗×2
- (2023). Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization. https://arxiv.org/abs/2303.01462 ↗×1
- (2023). MegaBlocks: Efficient Sparse Training with Mixture-of-Experts. Proceedings of Machine Learning and Systems 5 (MLSys 2023). https://arxiv.org/abs/2211.15841 ↗×2
- (2023). Scaling Laws for Reward Model Overoptimization. Proceedings of the 40th International Conference on Machine Learning (ICML 2023), PMLR 202, 10835–10866. https://arxiv.org/abs/2210.10760 ↗×2
- (2023). Localizing Model Behavior with Path Patching. arXiv:2304.05969. https://arxiv.org/abs/2304.05969 ↗×1
- (2023). Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations. arXiv:2301.04246. https://arxiv.org/abs/2301.04246 ↗×2
- ×3
- (2023). search Scholar ↗×1
- (2023). Mamba: Linear-Time Sequence Modeling with Selective State Spaces. https://arxiv.org/abs/2312.00752 ↗×4
- (2023). search Scholar ↗×2
- ×2
- ×1
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×2
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×2
- (2023). search Scholar ↗×1
- ×3
- (2023). Efficient Memory Management for Large Language Model Serving with PagedAttention. https://arxiv.org/abs/2309.06180 ↗×5
- (2023). search Scholar ↗×2
- (2023). Measuring Faithfulness in Chain-of-Thought Reasoning. arXiv:2307.13702. https://arxiv.org/abs/2307.13702 ↗×4
- (2023). Fast Inference from Transformers via Speculative Decoding. https://arxiv.org/abs/2211.17192 ↗×3
- (2023). BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. Proceedings of the 40th International Conference on Machine Learning (ICML 2023). https://arxiv.org/abs/2301.12597 ↗×6
- (2023). EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations. arXiv:2306.12059. https://arxiv.org/abs/2306.12059 ↗×1
- ×5
- (2023). AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration. Proceedings of Machine Learning and Systems 6 (MLSys 2024). https://arxiv.org/abs/2306.00978 ↗×2
- (2023). Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training. https://arxiv.org/abs/2305.14342 ↗×7
- (2023). search Scholar ↗×3
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×2
- (2023). search Scholar ↗×2
- (2023). search Scholar ↗×1
- (2023). Mass-Editing Memory in a Transformer. International Conference on Learning Representations (ICLR 2023). https://arxiv.org/abs/2210.07229 ↗×2
- (2023). Scaling deep learning for materials discovery. Nature, 624(7990), 80–85. https://www.nature.com/articles/s41586-023-06735-9 ↗ DOI: 10.1038/s41586-023-06735-9×4
- (2023). Simple and Controllable Music Generation. Advances in Neural Information Processing Systems 36 (NeurIPS 2023). https://arxiv.org/abs/2306.05284 ↗×6
- (2023). Evaluating Language-Model Agents on Realistic Autonomous Tasks. arXiv:2312.11671. https://arxiv.org/abs/2312.11671 ↗×4
- (2023). search Scholar ↗×2
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×5
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×3
- (2023). Executive Order 14110: Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. Federal Register, 88(210), 75191–75226 (Executive Order No. 14110, October 30, 2023). https://en.wikipedia.org/wiki/Executive_Order_14110 ↗×1
- (2023). Executive Order 14110: Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. Federal Register, 88(210), 75191–75226 (Executive Order 14110, signed October 30, 2023). https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence ↗×2
- ×7
- (2023). Does Writing with Language Models Reduce Content Diversity? arXiv:2309.05196 (later published at ICLR 2024). https://arxiv.org/abs/2309.05196 ↗×1
- (2023).×2
Not found: No paper by Park, Lan, Tran, & Park (2023) on systematic documentation of citation hallucination could be located via web search. Known 2023 studies on this topic (e.g., Walters & Wilder 2023; Bhattacharyya et al. 2023; MacDonald 2023) have different author lists. The cited combination appears to be itself a hallucinated citation.
- (2023). search Scholar ↗×2
- (2023). YaRN: Efficient Context Window Extension of Large Language Models. https://arxiv.org/abs/2309.00071 ↗×5
- (2023). search Scholar ↗×2
- (2023). Direct Preference Optimization: Your Language Model is Secretly a Reward Model. https://arxiv.org/abs/2305.18290 ↗×6
- (2023). search Scholar ↗×4
- (2023). search Scholar ↗×4
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×1
- (2023). NLP Evaluation in trouble: On the Need to Measure LLM Data Contamination for each Benchmark. Findings of the Association for Computational Linguistics: EMNLP 2023. https://arxiv.org/abs/2310.18018 ↗ DOI: 10.18653/v1/2023.findings-emnlp.722×3
- (2023). BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. Proceedings of the 40th International Conference on Machine Learning (ICML 2023). https://arxiv.org/abs/2301.12597 ↗ DOI: 10.5555/3618408.3619222×1
- (2023). Are Emergent Abilities of Large Language Models a Mirage? https://arxiv.org/abs/2304.15004 ↗×9
- (2023). Toolformer: Language Models Can Teach Themselves to Use Tools. Advances in Neural Information Processing Systems 36 (NeurIPS 2023). https://arxiv.org/abs/2302.04761 ↗×3
- (2023). GPT-4V(ision) System Card. OpenAI Technical Report (September 25, 2023). https://cdn.openai.com/papers/GPTV_System_Card.pdf ↗×1
- (2023). Towards Understanding Sycophancy in Language Models. arXiv:2310.13548. https://arxiv.org/abs/2310.13548 ↗×9
- (2023). search Scholar ↗×3
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×1
- (2023). search Scholar ↗×1
- ×1
- (2023). Activation Addition: Steering Language Models Without Optimization. arXiv:2308.10248. https://arxiv.org/abs/2308.10248 ↗×3
- ×1
- (2023). Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), 18359–18369. https://arxiv.org/abs/2212.06909 ↗ DOI: 10.48550/arXiv.2212.06909×3
- (2023). De novo design of protein structure and function with RFdiffusion. Nature, 620(7976), 1089–1100. https://www.nature.com/articles/s41586-023-06415-8 ↗ DOI: 10.1038/s41586-023-06415-8×4
- (2023). Jailbroken: How Does LLM Safety Training Fail? Advances in Neural Information Processing Systems 36 (NeurIPS 2023). https://arxiv.org/abs/2307.02483 ↗×5
- ×1
- (2023). Human Preference Score: Better Aligning Text-to-Image Models with Human Preference. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2023). https://arxiv.org/abs/2303.14420 ↗×1
- (2023). Tree of Thoughts: Deliberate Problem Solving with Large Language Models. https://arxiv.org/abs/2305.10601 ↗×7
- (2023). MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers. https://arxiv.org/abs/2305.07185 ↗×3
- ×2
- (2023). Adding Conditional Control to Text-to-Image Diffusion Models. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2023). https://arxiv.org/abs/2302.05543 ↗ DOI: 10.1109/ICCV51070.2023.00355×4
- (2023). PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel. Proceedings of the VLDB Endowment, 16(12), 3848–3860. https://arxiv.org/abs/2304.11277 ↗ DOI: 10.14778/3611540.3611569×1
- (2023). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Datasets and Benchmarks Track. https://arxiv.org/abs/2306.05685 ↗×5
- (2023). search Scholar ↗×1
- (2023). Least-to-Most Prompting Enables Complex Reasoning in Large Language Models. https://arxiv.org/abs/2205.10625 ↗×4
- (2023). search Scholar ↗×1
- (2023). Representation Engineering: A Top-Down Approach to AI Transparency. arXiv:2310.01405. https://arxiv.org/abs/2310.01405 ↗×7
- (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630(8016), 493–500. https://www.nature.com/articles/s41586-024-07487-w ↗ DOI: 10.1038/s41586-024-07487-w×3
- (2024). Adobe Launches Firefly Video Model and Enhances Image, Vector and Design Models. Adobe Newsroom press release, October 14, 2024. https://news.adobe.com/news/2024/10/101424-adobe-launches-firefly-video-model ↗×1
- (2024). Chameleon: Mixed-Modal Early-Fusion Foundation Models. arXiv:2405.09818. https://arxiv.org/abs/2405.09818 ↗×1
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×9
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×8
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×1
- (2024). π₀: A Vision-Language-Action Flow Model for General Robot Control. arXiv:2410.24164. https://arxiv.org/abs/2410.24164 ↗×1
- (2024). NVIDIA Blackwell Architecture Technical Overview. NVIDIA Technical Brief / Whitepaper (2024). https://resources.nvidia.com/en-us-blackwell-architecture ↗×2
- (2024). Large Language Monkeys: Scaling Inference Compute with Repeated Sampling. arXiv:2407.21787. https://arxiv.org/abs/2407.21787 ↗×1
- (2024). Matryoshka Sparse Autoencoders. AI Alignment Forum (blog post, December 19, 2024). https://www.alignmentforum.org/posts/zbebxYCqsryPALh8C/matryoshka-sparse-autoencoders ↗×1
- (2024). Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads. https://arxiv.org/abs/2401.10774 ↗×1
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×1
- (2024). Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality. https://arxiv.org/abs/2405.21060 ↗×2
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×15
- ×1
- (2025). Humanity's Last Exam. arXiv:2501.14249. https://arxiv.org/abs/2501.14249 ↗ DOI: 10.48550/arXiv.2501.14249×1
- (2024). ColPali: Efficient Document Retrieval with Vision Language Models. arXiv:2407.01449. https://arxiv.org/abs/2407.01449 ↗×2
- (2024). Break the Sequential Dependency of LLM Inference Using Lookahead Decoding. https://arxiv.org/abs/2402.02057 ↗×4
- (2024). Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations. Proceedings of the Third Conference on Causal Learning and Reasoning, PMLR 236, 160–187. https://proceedings.mlr.press/v236/geiger24a.html ↗×1
- (2024). Mochi 1: A new SOTA in open text-to-video. Genmo Blog (model release, October 2024); weights at https://huggingface.co/genmo/mochi-1-preview. https://www.genmo.ai/blog/mochi-1-a-new-sota-in-open-text-to-video ↗×1
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×10
- ×1
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×1
- (2024). Large-scale foundation model on single-cell transcriptomics. Nature Methods, 21(8), 1481–1491. https://www.nature.com/articles/s41592-024-02305-7 ↗ DOI: 10.1038/s41592-024-02305-7×1
- (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proceedings of the 41st International Conference on Machine Learning (ICML 2024). https://arxiv.org/abs/2402.12354 ↗×1
- (2024). AI Deception: A Survey of Examples, Risks, and Potential Solutions. https://arxiv.org/abs/2308.14752 ↗×1
Ambiguous: No 2024 paper found with Hendrycks as first author on in-context/agentic LLM deception. Most plausible target given the context is Park, Goldstein, O'Gara, Chen, & Hendrycks (2024), 'AI Deception: A Survey...', in Patterns; Hendrycks is the last author, however, so the in-text 'Hendrycks et al.' attribution appears to be a citation error. Cannot rule out that the author intended a different work (e.g., Scheurer et al. 2024 on strategic deception under pressure, or Hagendorff 2024 'Deception abilities emerged in LLMs').
- (2024). ORPO: Monolithic Preference Optimization without Reference Model. https://arxiv.org/abs/2403.07691 ↗×3
- (2024). RULER: What's the Real Context Size of Your Long-Context Language Models? https://arxiv.org/abs/2404.06654 ↗×2
- (2024). search Scholar ↗×13
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×4
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×1
- (2024). OpenVLA: An Open-Source Vision-Language-Action Model. arXiv:2406.09246. https://arxiv.org/abs/2406.09246 ↗×2
- (2024).×1
Not found: No Korbak-first-authored 2024 paper on game-theoretic analyses of AI safety/control protocols found. The clearly matching game-theoretic AI control paper is Griffin, Thomson, Shlegeris, & Abate (2024), 'Games for AI Control' (arXiv:2409.07985), which is not by Korbak. Korbak's relevant AI-control work, 'A sketch of an AI control safety case' (Korbak, Clymer, Hilton, Shlegeris, & Irving), is arXiv:2501.17315 from January 2025, not 2024, and is about safety cases rather than equilibrium/game analysis. Likely a hallucinated or misattributed citation.
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×4
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×6
- (2024). search Scholar ↗×3
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×5
- (2024). GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models. arXiv:2410.05229. https://arxiv.org/abs/2410.05229 ↗×2
- (2024). The Operational Risks of AI in Large-Scale Biological Attacks: Results of a Red-Team Study. RAND Corporation Research Report RR-A2977-2. https://www.rand.org/pubs/research_reports/RRA2977-2.html ↗ DOI: 10.7249/RRA2977-2×1
- (2024). Introducing computer use, a new Claude 3.5 Sonnet, and a new Claude 3.5 Haiku. Anthropic News (blog post), October 22, 2024. https://www.anthropic.com/news/3-5-models-and-computer-use ↗×6
- (2024). Learning to Reason with LLMs. OpenAI Research Blog (September 12, 2024). https://openai.com/index/learning-to-reason-with-llms/ ↗×14
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×4
- (2024). search Scholar ↗×1
- ×1
- (2024). Introducing Gen-3 Alpha: A New Frontier for Video Generation. Runway Research (product/model announcement). https://runwayml.com/research/introducing-gen-3-alpha ↗×1
- (2024). The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. arXiv:2408.06292. https://arxiv.org/abs/2408.06292 ↗ DOI: 10.48550/arXiv.2408.06292×2
- (2024). On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial. arXiv:2403.14380. https://arxiv.org/abs/2403.14380 ↗×3
- (2024). Introducing OpenAI o1-preview. OpenAI (blog announcement, September 12, 2024). https://openai.com/index/introducing-openai-o1-preview/ ↗×2
- (2024). Learning to Reason with LLMs. OpenAI (blog announcement), September 12, 2024. https://openai.com/index/learning-to-reason-with-llms/ ↗×2
- (2024). FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision. https://arxiv.org/abs/2407.08608 ↗×1
- (2024). DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models. arXiv:2402.03300. https://arxiv.org/abs/2402.03300 ↗×2
- (2024). AI models collapse when trained on recursively generated data. https://www.nature.com/articles/s41586-024-07566-y ↗×5
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×1
- (2024). HunyuanVideo: A Systematic Framework For Large Video Generative Models. arXiv:2412.03603. https://arxiv.org/abs/2412.03603 ↗×1
- (2024). Trillium TPU is GA (sixth-generation Tensor Processing Unit). Google Cloud Blog (announcement, 2024). https://blog.google/feed/trillium-tpus/ ↗×1
- (2024). Solving olympiad geometry without human demonstrations. Nature, 625(7995), 476–482. https://www.nature.com/articles/s41586-023-06747-5 ↗ DOI: 10.1038/s41586-023-06747-5×3
- (2024). search Scholar ↗×1
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×2
- (2024). search Scholar ↗×2
- (2025). search Scholar ↗×2
- (2025). search Scholar ↗×1
- (2025). Introducing deep research. OpenAI (product announcement, February 2, 2025). https://openai.com/index/introducing-deep-research/ ↗×1
- (2025). DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. arXiv:2501.12948. https://arxiv.org/abs/2501.12948 ↗ DOI: 10.48550/arXiv.2501.12948×4
- (2025). Claude's extended thinking. Anthropic (company blog/news). https://www.anthropic.com/news/visible-extended-thinking ↗×2
- (2025). Claude 3.7 Sonnet and Claude Code. Anthropic News (announcement, February 24, 2025). https://www.anthropic.com/news/claude-3-7-sonnet ↗×2
- (2025). search Scholar ↗×3
- (2025). search Scholar ↗×6
- (2025). search Scholar ↗×3
- (2025). search Scholar ↗×3
- (2025). search Scholar ↗×1
- (2025). search Scholar ↗×1