Table of Links
Impact statement, and References
Impact statement
Transparency Index. Given the exponential rise in societal reliance on large foundation models, ensuring transparency in their training approach, architecture, and downstream application is crucial for public accountability, scientific advancement, and effective governance. To uphold this objective, we publicly release our training code base, data sources, and evaluation pipeline. We assess the transparency of MOMENT using the criteria outlined by Bommasani et al. (2023), focusing on upstream resources utilized during training and model description, encompassing 32 and 33 transparency indicators, respectively. We report expected upstream and model transparency scores for MOMENT in Tab. 34. Notably, MOMENT is expected to have one of the highest levels of upstream transparency. However, its model transparency scores are lower, primarily because comprehensive (external and third-party) harm and trustworthiness evaluations are not yet well understood in the context of time series modeling.
Environmental Impact. We train multiple models over many days resulting in significant energy usage and a sizeable carbon footprint. However, we hope that releasing our models will ensure that future time series modeling efforts are quicker and more efficient, resulting in lower carbon emissions.
We follow prior work (Bender et al., 2021; Patterson et al., 2021; Touvron et al., 2023; Wu et al., 2022; Dodge et al., 2022) and estimate the carbon footprint of pre-training all variants of MOMENT based on the GPU device used and the carbon efficiency of the electricity grid. Our CO2 emission estimates are reported in Tab. 8.
We use the Total Graphics Power (TGP) to calculate the total power consumed for training MOMENT models, although the power actually drawn by the GPU will vary somewhat with GPU utilization during training. Our calculations do not account for the power demands of other components of our compute infrastructure. We use 336.566 kg CO2/MWh as the standard value of CO2 emitted per megawatt hour of energy consumed in Pittsburgh[8].
We share an upper limit on the individual CO2 emissions of each model, as well as a more realistic estimate of the carbon emissions of MOMENT-small and MOMENT-base: these two variants were trained simultaneously on a single NVIDIA RTX A6000 GPU, so the power consumed by the GPU was shared between them. MOMENT-large was trained independently on a single RTX A6000 GPU.
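To make this calculation concrete, the minimal sketch below estimates emissions from a GPU's TGP, training time, and the grid's carbon intensity, with an optional split for models that shared one GPU. The TGP and training-hour values in the example are hypothetical placeholders, not the figures behind Tab. 8.

```python
# Minimal sketch of the emission estimate described above.
# Only the carbon intensity comes from the text; the TGP and hours below are illustrative.

CARBON_INTENSITY_KG_PER_MWH = 336.566  # Pittsburgh grid, kg CO2 per MWh


def co2_kg(tgp_watts: float, hours: float, shared_by: int = 1) -> float:
    """Upper-bound CO2 (kg): GPU power (W) x training time (h), converted to MWh,
    times grid carbon intensity. `shared_by` > 1 splits the estimate when
    multiple models were trained on the same GPU."""
    mwh = (tgp_watts * hours) / 1e6  # W*h -> MWh
    return mwh * CARBON_INTENSITY_KG_PER_MWH / shared_by


# Example with hypothetical numbers: a 300 W TGP GPU running for 240 hours.
upper_bound = co2_kg(tgp_watts=300, hours=240)           # one model per GPU
shared = co2_kg(tgp_watts=300, hours=240, shared_by=2)   # two models sharing the GPU
print(f"upper bound: {upper_bound:.1f} kg CO2, shared: {shared:.1f} kg CO2")
```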
Ethical considerations and potential misuse. Despite MOMENT's promising performance in limited-data settings, it is important to use its predictions with care, especially in high-stakes settings such as healthcare. Before MOMENT is used for high-stakes decision-making, we recommend fine-tuning and evaluating the model with task-specific in-domain data.
References
Ansari, A. F., Stella, L., Turkmen, C., Zhang, X., Mercado, P., Shen, H., Shchur, O., Rangapuram, S. S., Arango, S. P., Kapoor, S., et al. Chronos: Learning the language of time series. arXiv preprint arXiv:2403.07815, 2024.
Ba, J. L., Kiros, J. R., and Hinton, G. E. Layer normalization, 2016.
Bao, H., Dong, L., Piao, S., and Wei, F. BEiT: BERT pre-training of image transformers. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=p-BhZSz59o4.
Bender, E. M., Gebru, T., McMillan-Major, A., and Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT '21, pp. 610–623, New York, NY, USA, 2021. Association for Computing Machinery. ISBN 9781450383097. doi: 10.1145/3442188.3445922. URL https://doi.org/10.1145/3442188.3445922.
Bommasani, R., Klyman, K., Longpre, S., Kapoor, S., Maslej, N., Xiong, B., Zhang, D., and Liang, P. The foundation model transparency index, 2023.
Cai, Y., Goswami, M., Choudhry, A., Srinivasan, A., and Dubrawski, A. Jolt: Jointly learned representations of language and time-series. In Deep Generative Models for Health Workshop NeurIPS 2023, 2023.
California Department of Transportation. Performance measurement system (pems), 2024. URL http://pems.dot.ca.gov/. Accessed: 2024-02-01.
Cao, D., Jia, F., Arik, S. O., Pfister, T., Zheng, Y., Ye, W., and Liu, Y. Tempo: Prompt-based generative pre-trained transformer for time series forecasting, 2023.
Centers for Disease Control and Prevention. Fluview: Flu activity & surveillance, 2024. URL https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html. Accessed: 2024-02-01.
Challu, C., Olivares, K. G., Oreshkin, B. N., Garza Ramirez, F., Mergenthaler Canseco, M., and Dubrawski, A. NHITS: Neural hierarchical interpolation for time series forecasting. Proceedings of the AAAI Conference on Artificial Intelligence, 37(6):6989–6997, Jun. 2023. doi: 10.1609/aaai.v37i6.25854. URL https://ojs.aaai.org/index.php/AAAI/article/view/25854.
Challu, C. I., Jiang, P., Nian Wu, Y., and Callot, L. Deep generative model with hierarchical latent factors for time series anomaly detection. In Camps-Valls, G., Ruiz, F. J. R., and Valera, I. (eds.), Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, volume 151 of Proceedings of Machine Learning Research, pp. 1643–1654. PMLR, 28–30 Mar 2022. URL https://proceedings.mlr.press/v151/challu22a.html.
Chung, H. W., Hou, L., Longpre, S., Zoph, B., Tay, Y., Fedus, W., Li, Y., Wang, X., Dehghani, M., Brahma, S., et al. Scaling instruction-finetuned language models. arXiv preprint arXiv:2210.11416, 2022.
Cui, Z., Chen, W., and Chen, Y. Multi-scale convolutional neural networks for time series classification, 2016.
Das, A., Kong, W., Sen, R., and Zhou, Y. A decoder-only foundation model for time-series forecasting. arXiv preprint arXiv:2310.10688, 2023.
Dau, H. A., Keogh, E., Kamgar, K., Yeh, C.-C. M., Zhu, Y., Gharghabi, S., Ratanamahatana, C. A., Yanping, Hu, B., Begum, N., Bagnall, A., Mueen, A., Batista, G., and Hexagon-ML. The ucr time series classification archive, October 2018. URL https://www.cs.ucr.edu/~eamonn/time_series_data_2018/.
Day, K., Christl, D., Salvi, R., and Sriram, P. Video pretrained transformer: A multimodal mixture of pre-trained experts, 2023.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Burstein, J., Doran, C., and Solorio, T. (eds.), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics. doi: 10.18653/v1/N19-1423. URL https://aclanthology.org/N19-1423.
Dodge, J., Prewitt, T., Tachet des Combes, R., Odmark, E., Schwartz, R., Strubell, E., Luccioni, A. S., Smith, N. A., DeCario, N., and Buchanan, W. Measuring the carbon intensity of ai in cloud instances. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT '22, pp. 1877–1894, New York, NY, USA, 2022. Association for Computing Machinery. ISBN 9781450393522. doi: 10.1145/3531146.3533234. URL https://doi.org/10.1145/3531146.3533234.
Dong, J., Wu, H., Zhang, H., Zhang, L., Wang, J., and Long, M. Simmtm: A simple pre-training framework for masked time-series modeling. In Advances in Neural Information Processing Systems, 2023.
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., and Houlsby, N. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=YicbFdNTTy.
Ekambaram, V., Jati, A., Nguyen, N. H., Dayama, P., Reddy, C., Gifford, W. M., and Kalagnanam, J. Tiny time mixers (ttms): Fast pre-trained models for enhanced zero/few-shot forecasting of multivariate time series, 2024.
Eldele, E., Ragab, M., Chen, Z., Wu, M., Kwoh, C. K., Li, X., and Guan, C. Time-series representation learning via temporal and contextual contrasting. In Zhou, Z.-H. (ed.), Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2352–2359. International Joint Conferences on Artificial Intelligence Organization, 8 2021. doi: 10.24963/ijcai.2021/324. URL https://doi.org/10.24963/ijcai.2021/324. Main Track.
Franceschi, J.-Y., Dieuleveut, A., and Jaggi, M. Unsupervised scalable representation learning for multivariate time series. In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R. (eds.), Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019. URL https://proceedings.neurips.cc/paper_files/paper/2019/file/53c6de78244e9f528eb3e1cda69699bb-Paper.pdf.
Gao, L., Biderman, S., Black, S., Golding, L., Hoppe, T., Foster, C., Phang, J., He, H., Thite, A., Nabeshima, N., Presser, S., and Leahy, C. The pile: An 800gb dataset of diverse text for language modeling, 2020.
Garza, A. and Mergenthaler-Canseco, M. Timegpt-1. arXiv preprint arXiv:2310.03589, 2023.
Garza, F., Mergenthaler Canseco, M., Challu, C., and Olivares, K. StatsForecast: Lightning fast forecasting with statistical and econometric models. PyCon Salt Lake City, Utah, US 2022, 2022. URL https://github.com/Nixtla/statsforecast.
Godahewa, R. W., Bergmeir, C., Webb, G. I., Hyndman, R., and Montero-Manso, P. Monash time series forecasting archive. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), 2021. URL https://openreview.net/forum?id=wEc1mgAjU-.
Goswami, M., Boecking, B., and Dubrawski, A. Weak supervision for affordable modeling of electrocardiogram data. In AMIA Annual Symposium Proceedings, volume 2021, pp. 536. American Medical Informatics Association, 2021.
Goswami, M., Challu, C. I., Callot, L., Minorics, L., and Kan, A. Unsupervised model selection for time series anomaly detection. In The Eleventh International Conference on Learning Representations, 2023a. URL https://openreview.net/forum?id=gOZ_pKANaPW.
Goswami, M., Sanil, V., Choudhry, A., Srinivasan, A., Udompanyawit, C., and Dubrawski, A. AQua: A benchmarking tool for label quality assessment. In Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2023b. URL https://openreview.net/forum?id=dhJ8VbcEtX.
Gruver, N., Finzi, M. A., Qiu, S., and Wilson, A. G. Large language models are zero-shot time series forecasters. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=md68e8iZK1.
He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, 2016. doi: 10.1109/CVPR.2016.90.
Hundman, K., Constantinou, V., Laporte, C., Colwell, I., and Soderstrom, T. Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, pp. 387–395, 2018.
Ismail Fawaz, H., Forestier, G., Weber, J., Idoumghar, L., and Muller, P.-A. Deep learning for time series classification: a review. Data Mining and Knowledge Discovery, 33(4):917–963, 2019.
Jin, M., Wang, S., Ma, L., Chu, Z., Zhang, J. Y., Shi, X., Chen, P.-Y., Liang, Y., Li, Y.-F., Pan, S., and Wen, Q. Time-llm: Time series forecasting by reprogramming large language models, 2023.
Kim, T., Kim, J., Tae, Y., Park, C., Choi, J.-H., and Choo, J. Reversible instance normalization for accurate time-series forecasting against distribution shift. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=cGDAkQo1C0p.
Lai, G., Chang, W.-C., Yang, Y., and Liu, H. Modeling long- and short-term temporal patterns with deep neural networks. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR '18, pp. 95–104, New York, NY, USA, 2018. Association for Computing Machinery. ISBN 9781450356572. doi: 10.1145/3209978.3210006. URL https://doi.org/10.1145/3209978.3210006.
Le Guennec, A., Malinowski, S., and Tavenard, R. Data Augmentation for Time Series Classification using Convolutional Neural Networks. In ECML/PKDD Workshop on Advanced Analytics and Learning on Temporal Data, Riva Del Garda, Italy, September 2016. URL https://shs.hal.science/halshs-01357973.
Li, J., Li, D., Savarese, S., and Hoi, S. Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. arXiv preprint arXiv:2301.12597, 2023a.
Li, S., Jin, X., Xuan, Y., Zhou, X., Chen, W., Wang, Y.-X., and Yan, X. Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. Advances in neural information processing systems, 32, 2019.
Li, Y., Fan, H., Hu, R., Feichtenhofer, C., and He, K. Scaling language-image pre-training via masking. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 23390–23400, Los Alamitos, CA, USA, Jun 2023b. IEEE Computer Society. doi: 10.1109/CVPR52729.2023.02240. URL https://doi.ieeecomputersociety.org/10.1109/CVPR52729.2023.02240.
Li, Z., Rao, Z., Pan, L., Wang, P., and Xu, Z. Ti-mae: Self-supervised masked time series autoencoders, 2023c.
Liu, S., Yu, H., Liao, C., Li, J., Lin, W., Liu, A. X., and Dustdar, S. Pyraformer: Low-complexity pyramidal attention for long-range time series modeling and forecasting. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=0EXmFzUn5I.
Liu, Y., Hu, T., Zhang, H., Wu, H., Wang, S., Ma, L., and Long, M. itransformer: Inverted transformers are effective for time series forecasting. arXiv preprint arXiv:2310.06625, 2023.
Loshchilov, I. and Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=Bkg6RiCqY7.
Lu, K., Grover, A., Abbeel, P., and Mordatch, I. Frozen pretrained transformers as universal computation engines. Proceedings of the AAAI Conference on Artificial Intelligence, 36(7):7628–7636, Jun. 2022. doi: 10.1609/aaai.v36i7.20729. URL https://ojs.aaai.org/index.php/AAAI/article/view/20729.
Ma, Q., Liu, Z., Zheng, Z., Huang, Z., Zhu, S., Yu, Z., and Kwok, J. T. A survey on time-series pre-trained models, 2023.
Max Planck Institute for Biogeochemistry. Weather data, 2024. URL https://www.bgc-jena.mpg.de/wetter/. Accessed: 2024-02-01.
Narwariya, J., Malhotra, P., Vig, L., Shroff, G., and Vishnu, T. V. Meta-learning for few-shot time series classification. In Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, CoDS COMAD 2020, pp. 28–36, New York, NY, USA, 2020. Association for Computing Machinery. ISBN 9781450377386. doi: 10.1145/3371158.3371162. URL https://doi.org/10.1145/3371158.3371162.
Nie, Y., Nguyen, N. H., Sinthong, P., and Kalagnanam, J. A time series is worth 64 words: Long-term forecasting with transformers. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=Jbdc0vTOcol.
Oreshkin, B. N., Carpov, D., Chapados, N., and Bengio, Y. N-beats: Neural basis expansion analysis for interpretable time series forecasting. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=r1ecqn4YwB.
Oreshkin, B. N., Carpov, D., Chapados, N., and Bengio, Y. Meta-learning framework with applications to zero-shot time-series forecasting. Proceedings of the AAAI Conference on Artificial Intelligence, 35(10):9242–9250, May 2021. doi: 10.1609/aaai.v35i10.17115. URL https://ojs.aaai.org/index.php/AAAI/article/view/17115.
Paparrizos, J., Boniol, P., Palpanas, T., Tsay, R. S., Elmore, A., and Franklin, M. J. Volume under the surface: A new accuracy evaluation measure for time-series anomaly detection. Proc. VLDB Endow., 15(11):2774–2787, Jul 2022a. ISSN 2150-8097. doi: 10.14778/3551793.3551830. URL https://doi.org/10.14778/3551793.3551830.
Paparrizos, J., Kang, Y., Boniol, P., Tsay, R. S., Palpanas, T., and Franklin, M. J. Tsb-uad: An end-to-end benchmark suite for univariate time-series anomaly detection. Proc. VLDB Endow., 15(8):1697–1711, Apr 2022b. ISSN 2150-8097. doi: 10.14778/3529337.3529354. URL https://doi.org/10.14778/3529337.3529354.
Patterson, D., Gonzalez, J., Le, Q., Liang, C., Munguia, L.-M., Rothchild, D., So, D., Texier, M., and Dean, J. Carbon emissions and large neural network training, 2021.
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., and Sutskever, I. Learning transferable visual models from natural language supervision. In Meila, M. and Zhang, T. (eds.), Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pp. 8748–8763. PMLR, 18–24 Jul 2021. URL https://proceedings.mlr.press/v139/radford21a.html.
Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., and Liu, P. J. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67, 2020. URL http://jmlr.org/papers/v21/20-074.html.
Ramaswamy, S., Rastogi, R., and Shim, K. Efficient algorithms for mining outliers from large data sets. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, SIGMOD '00, pp. 427–438, New York, NY, USA, 2000. Association for Computing Machinery. ISBN 1581132174. doi: 10.1145/342009.335437. URL https://doi.org/10.1145/342009.335437.
Rasul, K., Ashok, A., Williams, A. R., Khorasani, A., Adamopoulos, G., Bhagwatkar, R., Biloš, M., Ghonia, H., Hassen, N. V., Schneider, A., et al. Lag-llama: Towards foundation models for time series forecasting. arXiv preprint arXiv:2310.08278, 2023.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015. doi: 10.1007/s11263-015-0816-y.
Schmidl, S., Wenig, P., and Papenbrock, T. Anomaly detection in time series: A comprehensive evaluation. Proc. VLDB Endow., 15(9):1779–1797, May 2022. ISSN 2150-8097. doi: 10.14778/3538598.3538602. URL https://doi.org/10.14778/3538598.3538602.
Schneider, S. H. and Dickinson, R. E. Climate modeling. Reviews of Geophysics, 12(3):447–493, 1974.
Serrà, J., Pascual, S., and Karatzoglou, A. Towards a universal neural network encoder for time series. In International Conference of the Catalan Association for Artificial Intelligence, 2018. URL https://api.semanticscholar.org/CorpusID:13675490.
Shaw, P., Uszkoreit, J., and Vaswani, A. Self-attention with relative position representations. arXiv preprint arXiv:1803.02155, 2018.
Shen, J., Li, L., Dery, L. M., Staten, C., Khodak, M., Neubig, G., and Talwalkar, A. Cross-modal fine-tuning: Align then refine, 2023.
Smith, L. N. and Topin, N. Super-convergence: Very fast training of neural networks using large learning rates. In Artificial intelligence and machine learning for multi-domain operations applications, volume 11006, pp. 369–386. SPIE, 2019.
Su, Y., Zhao, Y., Niu, C., Liu, R., Sun, W., and Pei, D. Robust anomaly detection for multivariate time series through stochastic recurrent neural network. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp. 2828–2837, 2019.
Talukder, S., Yue, Y., and Gkioxari, G. Totem: Tokenized time series embeddings for general time series analysis, 2024.
Tanisaro, P. and Heidemann, G. Time series classification using time warping invariant echo state networks. In 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 831–836, 2016. doi: 10.1109/ICMLA.2016.0149.
Tolstikhin, I. O., Houlsby, N., Kolesnikov, A., Beyer, L., Zhai, X., Unterthiner, T., Yung, J., Steiner, A., Keysers, D., Uszkoreit, J., et al. Mlp-mixer: An all-mlp architecture for vision. Advances in neural information processing systems, 34:24261–24272, 2021.
Tonekaboni, S., Eytan, D., and Goldenberg, A. Unsupervised representation learning for time series with temporal neighborhood coding. In International Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=8qDwejCuCN.
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., Bikel, D., Blecher, L., Ferrer, C. C., Chen, M., Cucurull, G., Esiobu, D., Fernandes, J., Fu, J., Fu, W., Fuller, B., Gao, C., Goswami, V., Goyal, N., Hartshorn, A., Hosseini, S., Hou, R., Inan, H., Kardas, M., Kerkez, V., Khabsa, M., Kloumann, I., Korenev, A., Koura, P. S., Lachaux, M.-A., Lavril, T., Lee, J., Liskovich, D., Lu, Y., Mao, Y., Martinet, X., Mihaylov, T., Mishra, P., Molybog, I., Nie, Y., Poulton, A., Reizenstein, J., Rungta, R., Saladi, K., Schelten, A., Silva, R., Smith, E. M., Subramanian, R., Tan, X. E., Tang, B., Taylor, R., Williams, A., Kuan, J. X., Xu, P., Yan, Z., Zarov, I., Zhang, Y., Fan, A., Kambadur, M., Narang, S., Rodriguez, A., Stojnic, R., Edunov, S., and Scialom, T. Llama 2: Open foundation and fine-tuned chat models, 2023.
Trindade, A. ElectricityLoadDiagrams20112014. UCI Machine Learning Repository, 2015. DOI: https://doi.org/10.24432/C58C86.
Van Den Oord, A., Vinyals, O., et al. Neural discrete representation learning. Advances in neural information processing systems, 30, 2017.
van der Maaten, L. Accelerating t-sne using tree-based algorithms. Journal of Machine Learning Research, 15(93):3221–3245, 2014. URL http://jmlr.org/papers/v15/vandermaaten14a.html.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. Attention is all you need. In Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (eds.), Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
Wang, Z., Yan, W., and Oates, T. Time series classification from scratch with deep neural networks: A strong baseline. In 2017 International Joint Conference on Neural Networks (IJCNN), pp. 1578–1585, 2017. doi: 10.1109/IJCNN.2017.7966039.
Wen, Q., Zhou, T., Zhang, C., Chen, W., Ma, Z., Yan, J., and Sun, L. Transformers in time series: A survey. In Elkind, E. (ed.), Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 6778–6786. International Joint Conferences on Artificial Intelligence Organization, 8 2023. doi: 10.24963/ijcai.2023/759. URL https://doi.org/10.24963/ijcai.2023/759. Survey Track.
Woo, G., Liu, C., Kumar, A., Xiong, C., Savarese, S., and Sahoo, D. Unified training of universal time series forecasting transformers. arXiv preprint arXiv:2402.02592, 2024.
Wu, C.-J., Raghavendra, R., Gupta, U., Acun, B., Ardalani, N., Maeng, K., Chang, G., Behram, F. A., Huang, J., Bai, C., Gschwind, M., Gupta, A., Ott, M., Melnikov, A., Candido, S., Brooks, D., Chauhan, G., Lee, B., Lee, H.-H. S., Akyildiz, B., Balandat, M., Spisak, J., Jain, R., Rabbat, M., and Hazelwood, K. Sustainable ai: Environmental implications, challenges and opportunities, 2022.
Wu, H., Xu, J., Wang, J., and Long, M. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. In Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W. (eds.), Advances in Neural Information Processing Systems, 2021. URL https://openreview.net/forum?id=I55UqU-M11y.
Wu, H., Hu, T., Liu, Y., Zhou, H., Wang, J., and Long, M. Timesnet: Temporal 2d-variation modeling for general time series analysis. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=ju_Uqw384Oq.
Wu, R. and Keogh, E. J. Current time series anomaly detection benchmarks are flawed and are creating the illusion of progress. IEEE Transactions on Knowledge & Data Engineering, 35(03):2421–2429, Mar 2023. ISSN 1558-2191. doi: 10.1109/TKDE.2021.3112126.
Xie, Z., Zhang, Z., Cao, Y., Lin, Y., Bao, J., Yao, Z., Dai, Q., and Hu, H. Simmim: A simple framework for masked image modeling. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9643–9653, 2022. doi: 10.1109/CVPR52688.2022.00943.
Xu, H., Chen, W., Zhao, N., Li, Z., Bu, J., Li, Z., Liu, Y., Zhao, Y., Pei, D., Feng, Y., et al. Unsupervised anomaly detection via variational auto-encoder for seasonal kpis in web applications. In Proceedings of the 2018 world wide web conference, pp. 187–196, 2018.
Xu, J., Wu, H., Wang, J., and Long, M. Anomaly transformer: Time series anomaly detection with association discrepancy. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=LzQQ89U1qm_.
Yue, Z., Wang, Y., Duan, J., Yang, T., Huang, C., Tong, Y., and Xu, B. Ts2vec: Towards universal representation of time series. Proceedings of the AAAI Conference on Artificial Intelligence, 36(8):8980–8987, Jun. 2022. doi: 10.1609/aaai.v36i8.20881. URL https://ojs.aaai.org/index.php/AAAI/article/view/20881.
Zebik, M., Korytkowski, M., Angryk, R., and Scherer, R. Convolutional Neural Networks for Time Series Classification, pp. 635–642. Springer International Publishing, Cham, 2017. ISBN 978-3-319-59060-8. doi: 10.1007/978-3-319-59060-8_57. URL https://doi.org/10.1007/978-3-319-59060-8_57.
Zerveas, G., Jayaraman, S., Patel, D., Bhamidipaty, A., and Eickhoff, C. A transformer-based framework for multivariate time series representation learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, KDD '21, pp. 2114–2124, New York, NY, USA, 2021. Association for Computing Machinery. ISBN 9781450383325. doi: 10.1145/3447548.3467401. URL https://doi.org/10.1145/3447548.3467401.
Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H., and Zhang, W. Informer: Beyond efficient transformer for long sequence time-series forecasting. Proceedings of the AAAI Conference on Artificial Intelligence, 35(12):11106–11115, May 2021. doi: 10.1609/aaai.v35i12.17325. URL https://ojs.aaai.org/index.php/AAAI/article/view/17325.
Zhou, T., Ma, Z., Wen, Q., Wang, X., Sun, L., and Jin, R. FEDformer: Frequency enhanced decomposed transformer for long-term series forecasting. In Proc. 39th International Conference on Machine Learning (ICML 2022), 2022.
Zhou, T., Niu, P., Wang, X., Sun, L., and Jin, R. One fits all: Power general time series analysis by pretrained LM. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=gMS6FVZvmF.
Authors:
(1) Mononito Goswami, Auton Lab, Robotics Institute, Carnegie Mellon University, Pittsburgh, USA ([email protected]);
(2) Konrad Szafer, Auton Lab, Robotics Institute, Carnegie Mellon University, Pittsburgh, USA, with equal contribution, order decided using a random generator;
(3) Arjun Choudhry, Auton Lab, Robotics Institute, Carnegie Mellon University, Pittsburgh, USA, with equal contribution, order decided using a random generator;
(4) Yifu Cai, Auton Lab, Robotics Institute, Carnegie Mellon University, Pittsburgh, USA;
(5) Shuo Li, University of Pennsylvania, Philadelphia, USA;
(6) Artur Dubrawski, Auton Lab, Robotics Institute, Carnegie Mellon University, Pittsburgh, USA.
This paper is
[8] https://emissionsindex.org/