User contributions for Fall2024 Wiki Team6
15 December 2024
- 22:46, 15 December 2024 −12 Adafactor →Numerical Examples current Tag: Visual edit
- 22:43, 15 December 2024 −58 Adafactor →Step 2: Compute $G_{t}^{2}$ (Element-wise Square of Gradient) Tag: Visual edit
- 22:02, 15 December 2024 +495 Adafactor →Problem setup Tag: Visual edit
14 December 2024
- 23:12, 14 December 2024 −3 Adafactor →Numerical Examples Tag: Visual edit
- 23:06, 14 December 2024 −34 Adafactor No edit summary Tag: Visual edit
- 23:04, 14 December 2024 −54 Adafactor No edit summary Tag: Visual edit
- 22:43, 14 December 2024 +4 Adafactor →Numerical Examples Tag: Visual edit
13 December 2024
- 17:54, 13 December 2024 +4 Adafactor →Software Tools and Platforms Tag: Visual edit
- 17:51, 13 December 2024 +1,557 Adafactor →Conclusion Tag: Visual edit
- 17:48, 13 December 2024 −34 Adafactor →Applications: change tensorflow Tag: Visual edit
12 December 2024
- 12:58, 12 December 2024 −18 Adafactor →Proposed Hyperparameters for Adafactor
- 12:57, 12 December 2024 +123 Adafactor →4. Proposed Hyperparameters for Adafactor
- 12:56, 12 December 2024 +1,096 Adafactor →4. Proposed Hyperparameters for Adafactor
- 12:54, 12 December 2024 +29 Adafactor →Problem formulation
- 12:53, 12 December 2024 +507 Adafactor Undo revision 6932 by Fall2024 Wiki Team6 (talk) Tag: Undo
- 12:49, 12 December 2024 +1,013 Adafactor Undo revision 6933 by Fall2024 Wiki Team6 (talk) Tag: Undo
11 December 2024
- 21:07, 11 December 2024 +1,665 Adafactor →Applications Tag: Visual edit: Switched
- 21:02, 11 December 2024 +2,024 Adafactor →Introduction Tag: Visual edit: Switched
- 20:51, 11 December 2024 +21 Adafactor →Introduction
- 20:49, 11 December 2024 +1,249 Adafactor →Introduction Tag: Visual edit: Switched
- 18:02, 11 December 2024 −3 Adafactor →Numerical Examples Tag: Visual edit
- 17:57, 11 December 2024 +329 Adafactor →Numerical Examples Tag: Visual edit
- 17:44, 11 December 2024 +126 Adafactor →Numerical Examples Tag: Visual edit
- 17:23, 11 December 2024 +615 Adafactor →Numerical Examples Tag: Visual edit
- 13:10, 11 December 2024 +19 Adafactor →Numerical Examples Tag: Visual edit
- 03:00, 11 December 2024 −11 Adafactor →Numerical Examples Tag: Visual edit
- 02:58, 11 December 2024 +4,080 Adafactor →Numerical Examples Tag: Visual edit
- 00:26, 11 December 2024 −1,314 Adafactor →Numerical Examples Tag: Visual edit
- 00:23, 11 December 2024 +2 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:23, 11 December 2024 +2 Adafactor →Why Clipping
- 00:23, 11 December 2024 +1 Adafactor →5.Discussion
- 00:23, 11 December 2024 −1,013 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:23, 11 December 2024 −507 Adafactor →Why Clipping
- 00:22, 11 December 2024 +1,537 Adafactor →Problem formulation
- 00:21, 11 December 2024 −16 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:20, 11 December 2024 +2 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:19, 11 December 2024 +117 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:18, 11 December 2024 +7 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:17, 11 December 2024 +97 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:15, 11 December 2024 +281 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:08, 11 December 2024 +54 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:07, 11 December 2024 +13 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:06, 11 December 2024 +180 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:04, 11 December 2024 +2 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:04, 11 December 2024 +197 Adafactor →Why Adafactor is more memory efficient, compared to Adam
- 00:03, 11 December 2024 +13 Adafactor →Why Adafactor is more memory efficient, compared to Adam
10 December 2024
- 23:59, 10 December 2024 +68 Adafactor →Why Clipping
- 23:54, 10 December 2024 +130 Adafactor →Why Clipping
- 23:52, 10 December 2024 +379 Adafactor →Adafactor for Weighted Matrices
- 23:52, 10 December 2024 −379 Adafactor →Why Clipping