HypothesisTests.jl
HypothesisTests.jl is a Julia package that implements a wide range of hypothesis tests.
Quick start
Some examples:
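The examples themselves are not reproduced in this listing. As a stand-in, here is a minimal illustration of the package's core API (`OneSampleTTest`, `pvalue`, `confint`) on made-up data:

```julia
using HypothesisTests

# Made-up sample; test H0: the population mean is zero
x = [1.2, 0.8, 1.5, 0.7, 1.1, 0.9, 1.3]

t = OneSampleTTest(x)
pvalue(t)                  # two-sided p-value (tail = :both is the default)
pvalue(t, tail = :right)   # right-tailed alternative
confint(t)                 # 95% confidence interval for the mean
```

Printing the test object itself (`t`) shows a readable summary, including the point estimate, confidence interval, and p-value.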
Required Packages
- ChainRulesCore
- ChangesOfVariables
- Combinatorics
- CommonSolve
- ConstructionBase
- DataStructures
- DensityInterface
- Distributions
- DocStringExtensions
- DualNumbers
- HypergeometricFunctions
- InverseFunctions
- IrrationalConstants
- LogExpFunctions
- OrderedCollections
- SortingAlgorithms
- SparseArrays
- SpecialFunctions
- StaticArraysCore
- SuiteSparse
Used By Packages
- BEASTDataPrep
- BosonSampling
- CalibrationAnalysis
- CalibrationTests
- CancerSeqSim
- CausalGPSLC
- CausalityTools
- Crispulator
- CriticalDifferenceDiagrams
- DatagenCopulaBased
- ExoplanetsSysSim
- FeatureSelectors
- GreedyAlign
- HierarchialPerformanceTest
- InvariantCausalPrediction
- LinearRegressionKit
- MagnitudeDistributions
- MatrixCompletion
- MatrixMerge
- MCMCDebugging
- MendelImpute
- NeuroAnalysis
- NighttimeLights
- OrdinalGWAS
- RNAForecaster
- ScoreDrivenModels
- SpeciesToNetworks
- Sqlite3Stats
- VisualizeMotifs
Julia Packages
This website serves as a package browsing tool for the Julia programming language. It works by aggregating various sources on Github to help you find your next package.
By analogy, Julia Packages operates much like PyPI, Ember Observer, and Ruby Toolbox do for their respective stacks.
Releases: JuliaStats/HypothesisTests.jl
HypothesisTests v0.11.0
Diff since v0.10.13
Closed issues:
- Please extend pvalue from StatsAPI.jl ( #290 )
- Wald interval with continuity correction ( #292 )
- Add documentation link on the about section. ( #298 )
Merged pull requests:
- Wald interval with continuity correction ( #291 ) ( @PharmCat )
- Use StatsAPI.pvalue + StatsAPI.HypothesisTest ( #297 ) ( @devmotion )
Contributors
HypothesisTests v0.10.13
Diff since v0.10.12
- Lower and upper limits of CI for proportion ( #295 )
- Bound binomial CI ( #296 ) ( @palday )
HypothesisTests v0.10.12
Diff since v0.10.11
- fix ApproximateMannWhitney for Float32 ( #113 ) ( @bjarthur )
- Enable the two sided KS Test to deal with Ties. ( #282 ) ( @FHell )
- Update z.jl ( #283 ) ( @itsdebartha )
- Pearson's Chi-Squared Test with 2 vectors ( #289 ) ( @itsdebartha )
- Modified Jarque-Bera test (Urzua, 1996) ( #293 ) ( @itsdebartha )
- CompatHelper: bump compat for StatsBase to 0.34, (keep existing compat) ( #294 ) (@github-actions[bot])
HypothesisTests v0.10.11
Diff since v0.10.10
- Problem with types in ExactPermutationTest ( #274 )
- FisherExactTest dies horribly on corner case inputs ( #276 )
- BUG: Fisher test corner case/gh #276 ( #279 ) ( @kshedden )
HypothesisTests v0.10.10
Diff since v0.10.9
- confint take either alpha everywhere or level everywhere ( #268 )
- Remove alpha from docstrings ( #270 ) ( @devmotion )
HypothesisTests v0.10.9
Diff since v0.10.8
- Error computing p-value for KruskalWallisTest in Julia v1.6 ( #269 )
- Group docstrings for pvalue and confint with their tests ( #265 ) ( @nalimilan )
- CompatHelper: bump compat for Roots to 2, (keep existing compat) ( #271 ) (@github-actions[bot])
HypothesisTests v0.10.8
Diff since v0.10.7
- p val > 1 in ExactMannWhitneyUTest ( #126 )
- fix pvalue > 1 in MannWhitneyUTest ( #126 ) ( #243 ) ( @pdeffebach )
HypothesisTests v0.10.7
Diff since v0.10.6
- A paired t-test is not necessarily a one sample test ( #246 )
- Help with Kruskal-Wallis Test ( #253 )
- Wrong calculation for p-value for FDist ( #255 )
- Tail are opposite for pvalue and confint ( #256 )
- docs need an update ( #258 )
- Mention other names for OneSampleTTest ( #247 ) ( @rikhuijzer )
- Use StableRNG in more tests to avoid breakage in Julia 1.7 ( #252 ) ( @andreasnoack )
- Update badges and TagBot action ( #259 ) ( @nalimilan )
- Use more recent Documenter ( #260 ) ( @nalimilan )
- Run CompatHelper also for docs subdirectory ( #261 ) ( @devmotion )
- Fix TagBot action ( #262 ) ( @nalimilan )
- Fix some type instabilities of pvalue + confint ( #264 ) ( @devmotion )
HypothesisTests v0.10.6
Diff since v0.10.5
HypothesisTests v0.10.5
Diff since v0.10.4
- Help with the reviewing process ( #240 )
- FisherExactTest(1,1,1,1) "Need extrema to return two distinct values" ( #245 )
- Add t-test for cases where only stats are known ( #237 ) ( @rikhuijzer )
- Fix incorrect bracket and type instability in FisherExactTest ( #249 ) ( @devmotion )
BayesTesting.jl: Bayesian Hypothesis Testing without Tears
BayesTesting.jl implements a fresh approach to hypothesis testing in Julia. The Jeffreys-Lindley-Bartlett paradox does not occur. Any prior can be employed, including uninformative and reference priors, so the same prior employed for inference can be used for testing, and objective Bayesian posteriors can be used for testing. In standard problems when the posterior distribution matches (numerically) the frequentist sampling distribution or likelihood, there is a one-to-one correspondence with the frequentist test. The resulting posterior odds against the null hypothesis are easy to interpret (unlike p-values), do not violate the likelihood principle, and result from minimizing a linear combination of type I and II errors rather than fixing the type I error before testing. The testing procedure satisfies the Neyman-Pearson lemma, so tests are uniformly most powerful, and satisfy the most general Bayesian robustness theorem. BayesTesting.jl provides functions for a variety of standard testing situations, along with more generic modular functions to easily allow the testing procedure to be employed in novel situations. For example, given any Monte Carlo or MCMC posterior sample for an unknown quantity, θ, the generic BayesTesting.jl function mcodds can be used to test hypotheses concerning θ, often with one or two lines of code. The talk will demonstrate application of the new approach in several standard situations, including testing a sample mean, comparison of means, and regression parameter testing. A brief presentation of the methodology will be followed by examples. The examples illustrate our experiences with implementing the new testing procedure in Julia over the past year.
Julia/Economics
A tutorial series for economists learning to program in the Julia language.
Stepdown p-values for Multiple Hypothesis Testing in Julia
* The script to reproduce the results of this tutorial in Julia is located here .
To finish yesterday’s tutorial on hypothesis testing with non-parametric p-values in Julia, I show how to perform the bootstrap stepdown correction to p-values for multiple testing, which many economists find intimidating (including me) but is actually not so difficult.
First, I will show the simple case of Holm’s (1979) stepdown procedure. Then, I will build on the bootstrap example to apply one of the most exciting advances in econometric theory from the past decade, the bootstrap step-down procedure, developed by Romano and Wolf (2004) and extended by Chicago professor Azeem Shaikh . These methods allow one to bound the family-wise error rate, which is the probability of making at least one false discovery (i.e., rejecting a null hypothesis when the null hypothesis was true).
The Romano-Wolf-Shaikh approach is to account for the dependence among the estimators, and penalize estimators more as they become more independent. Here’s how I motivate their approach: First, consider the extreme case of two estimators that are identical. Then, if we reject the null hypothesis for the first, there is no reason to penalize the second; it would be very strange to reject the null hypothesis for the first but not the second when they are identical. Second, suppose they are almost perfectly dependent. Then, rejecting the null for one strongly suggests that we should also reject the null for the second, so we do not want to penalize one very strongly for the rejection of the other. But as they become increasingly independent, the rejection of one provides less and less affirmative information about the other, and we approach the case of perfect independence shown above, which receives the greatest penalty. The specifics of the procedure are below.
Holm’s Correction to the p-values
The following code computes these p-values. Julia (apparently) lacks a command that would tell me the index of the rank of the p-values, so my loop below does this, including the handling of ties (when some of the p-values are the same):
This is straightforward except for sort_index, which I constructed such that, e.g., if the first element of sort_index is 3, then the first element of pvalues is the third smallest. Unfortunately, it arbitrarily breaks ties in favor of the parameter that appears first in the MLE array, so the second entry of sort_index is 1 and the third entry is 2, even though the two corresponding p-values are equal.
Please email me or leave a comment if you know of a better way to handle ties.
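The original script is hosted externally and not reproduced here. As a stand-in, the following is a minimal sketch of Holm's (1979) correction; the function name and sample p-values are illustrative, not the post's original code. Like the post's sort_index, `sortperm` breaks ties by position:

```julia
# Sketch of Holm's (1979) stepdown correction to a vector of p-values.
function holm(pvalues)
    m = length(pvalues)
    order = sortperm(pvalues)      # smallest p-value first; ties broken by position
    adjusted = similar(pvalues)
    running_max = 0.0
    for (rank, idx) in enumerate(order)
        # Multiply the rank-th smallest p-value by (m - rank + 1), then enforce
        # monotonicity with a running maximum and cap at 1.
        running_max = max(running_max, (m - rank + 1) * pvalues[idx])
        adjusted[idx] = min(running_max, 1.0)
    end
    return adjusted
end

holm([0.012, 0.049, 0.049, 0.32])
```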
Bootstrap Stepdown p-values
Given our work yesterday, it is relatively easy to replace bootstrap marginal p-values with bootstrap stepdown p-values. We begin with samples, as created yesterday. The following code creates tNullDistribution, which is the same as nullDistribution from yesterday except as t-statistics (i.e., divided by standard error).
The only difference between the single p-values and the stepdown p-values is the use of the maximum t-statistic in the comparison to the null distribution, and the maximum is taken over only the parameter estimates whose p-values have not yet been computed. Notice that I used a dictionary in the return so that I could output single, stepdown, and Holm p-values from the stepdown function.
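Since the original script is not reproduced here, the following is a plausible reconstruction of the stepdown idea just described. The names `samples`, `mle`, `se`, and `tNullDistribution` follow the tutorial's prose, but the data are stand-ins and the function is illustrative:

```julia
using Statistics

# Stand-in bootstrap draws: a B × K matrix of bootstrap MLEs.
samples = randn(1000, 3) .* 0.1 .+ [0.5 0.0 1.0]
mle = vec(mean(samples, dims = 1))
se  = vec(std(samples, dims = 1))

tNullDistribution = (samples .- mle') ./ se'   # bootstrap t-statistics under H0
tstats = abs.(mle ./ se)

function stepdown_pvalues(tNull, tstats)
    order = sortperm(tstats, rev = true)       # test the largest t-statistic first
    p = zeros(length(tstats))
    for (step, idx) in enumerate(order)
        remaining = order[step:end]            # estimates not yet tested
        # Compare against the max |t| over the remaining parameters only.
        maxnull = vec(maximum(abs.(tNull[:, remaining]), dims = 2))
        p[idx] = mean(maxnull .> tstats[idx])
        # Enforce monotonicity of the stepdown p-values.
        step > 1 && (p[idx] = max(p[idx], p[order[step - 1]]))
    end
    return p
end

stepdown_pvalues(tNullDistribution, tstats)
```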
Bradley J. Setzler
Bootstrapping and Non-parametric p-values in Julia
To perform bootstrapping, we rely on Julia’s built-in sample function.
Wrapper Functions and the Likelihood of Interest
Now that we have a bootstrap index, we define the log-likelihood as a function of x and y, which are any subsets of X and Y, respectively.
Then, if we wish to evaluate loglike across various subsets of x and y, we use what is called a wrapper, which simply creates a copy of loglike that has already set the values of x and y. For example, the following function will evaluate loglike when x=X and y=Y:
Tip: use wrapper functions to manage arguments of your objective function that are not supposed to be accessed by the optimizer. Give the optimizer a function with only one argument: the parameters over which it is supposed to optimize your objective function.
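The tutorial's own code is hosted externally; here is an illustrative version of the wrapper pattern. `loglike` is a stand-in normal log-likelihood with parameters rho = [b0, b1, log(sigma²)], and the use of the Optim package is an assumption (any optimizer taking a one-argument function would do):

```julia
using Optim   # assumed optimizer package

# Stand-in data playing the role of the tutorial's X and Y.
X = randn(200)
Y = 1.0 .+ 2.0 .* X .+ randn(200)

function loglike(rho, x, y)
    sigma2 = exp(rho[3])                    # variance stored in log-units
    resid  = y .- (rho[1] .+ rho[2] .* x)
    return -0.5 * sum(resid .^ 2) / sigma2 - 0.5 * length(y) * log(2pi * sigma2)
end

# The wrapper fixes x = X and y = Y (and negates, since Optim minimizes),
# so the optimizer sees a single argument: the parameter vector.
wrapper(rho) = -loglike(rho, X, Y)

result = optimize(wrapper, zeros(3))        # Nelder-Mead by default
Optim.minimizer(result)
```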
Bootstrapping the OLS MLE
Now, we will use a random index, which is drawn for each b using the sample function, to take a random sample of individuals from the data, feed them into the function using a wrapper, then have the optimizer maximize the wrapper across the parameters. We repeat this process in a loop, so that we obtain the MLE for each subset. The following loop stores the MLE in each row of the matrix samples, using 1,000 bootstrap samples of size one-half (M) of the available sample:
The resulting matrix contains 1,000 samples of the MLE. As always, we must remember to exponentiate the variance estimates, because they were stored in log-units.
Bootstrapping for Non-parametric p-values
Estimates of the standard errors of the MLE estimates can be obtained by computing the standard deviation of each column,
where the number 1 indicates that the standard deviation is taken over columns (instead of rows).
The non-parametric p-value (for two-sided hypothesis testing) is the fraction of times that the absolute value of the MLE is greater than the absolute value of the null distribution.
Thus, two-sided testing uses the absolute value (abs), and one-sided testing only requires that we choose the right comparison operator (.> or .<).
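The steps above can be sketched as follows; `samples` is the B × K matrix of bootstrap MLEs, and the data here are stand-ins rather than the tutorial's:

```julia
using Statistics

samples = randn(1000, 3) .* 0.1 .+ [0.5 0.0 1.0]   # stand-in bootstrap MLEs
mle = vec(mean(samples, dims = 1))

se = vec(std(samples, dims = 1))        # standard deviation of each column

nullDistribution = samples .- mle'      # recenter the draws to impose the null
# Two-sided p-value: fraction of null draws at least as extreme as the estimate.
pvalues = vec(mean(abs.(nullDistribution) .> abs.(mle'), dims = 1))
```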
Results
Let the true parameters be,
The resulting bootstrap standard errors are,
and the non-parametric two-sided p -value estimates are,
Full documentation available at Read the Docs .
Last synced: 2024-08-25 02:07:29 UTC
How do I find a p value with HypothesisTests.jl?
The documentation provides the example:
but how do I enter this? I.e., what I tried isn’t accepted, so how do I specify that I want a two-tailed test, unless I want the left or right tail?
The documented function signature is telling you that :both is the default value for the tail keyword argument, so if you don’t supply an alternative value, you will be getting the two-tailed p-value:
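Concretely (made-up data; the tail values shown are the ones HypothesisTests documents):

```julia
using HypothesisTests

x = randn(50) .+ 0.3        # made-up sample
t = OneSampleTTest(x)

pvalue(t)                   # equivalent to pvalue(t, tail = :both)
pvalue(t, tail = :right)    # one-sided, right tail
pvalue(t, tail = :left)     # one-sided, left tail
```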
As an aside, I think I’ve mentioned it before, but to reiterate: it would be really good if you could copy/paste your own MWE into a fresh Julia session and run it before you post, to ensure that it actually reproduces the error you’re seeing. What you posted above doesn’t run, as you’re missing a parenthesis at the end. Also note that when supplying keyword arguments you need to name them, i.e. tail = :both rather than just :both
MethodError: no method matching pvalue(::HypothesisTests.ApproximateTwoSampleKSTest)
Closest candidates are:
pvalue(!Matched::HypothesisTests.ApproximateOneSampleKSTest; tail) at /home/brett/Documents/BEST.jl#==#0979e6f0-e5f0-4c29-8b3b-cc2ef56a2241:1
- top-level scope @ Local: 1 [inlined]
MethodError: no method matching pvalue(::HypothesisTests.ApproximateTwoSampleKSTest; tail=:both)
MethodError: no method matching pvalue(::HypothesisTests.ApproximateTwoSampleKSTest; tail=:right)
top-level scope@Local: 1[inlined]
Does it matter that I’m using Pluto? How do I ensure I’m adding the latest version of packages in Pluto?
I figured out that there was a conflict with other packages. There is also a problem that Pluto cuts stuff off.
Additionally, in your initial example, the code attempts to compute a p-value of a p-value, which, of course, is invalid.
Can I again suggest that for any problem you have, you try running the code in a fresh REPL session (NOT Pluto) to isolate where the issue comes from? From many of your posts I get the impression that the majority of your issues come from incorrect usage of Pluto rather than from the packages you’re using.
IJulia might also be a good alternative, as its behaviour is pretty much like the plain REPL.
Yes, please do your homework before posting here @brett_knoss . We have all kinds of members in the forum with all levels of expertise willing to help. However, if you start to just dump any error you get here, people will treat it as SPAM.
Ok, I’ll try to test things more carefully. I’m having trouble using the HypothesisTests.jl tutorial, because it doesn’t provide examples with working variables. Other tutorials would benefit from this as well. I think this is one of the goals of Pluto, with sample notebooks.
Non-parametric hypothesis tests with examples in Julia
November 30, 2022
2022-11-30 First draft
Introduction
This article is an extension of Farmer. 2022. “Non-Parametric Hypothesis Tests with Examples in R.” November 18, 2022 . Please check out the parent article for the theoretical background.
- Wilcoxon rank-sum (Mann-Whitney U test) ( Section 3 )
- Wilcoxon signed-rank test ( Section 4 )
- Kruskal-Wallis test ( Section 5 )
Import packages
Data import and cleanup.
I have subsetted the data from 1928 onward and dropped any columns consisting entirely of NAs or zeros. To do so, for eachcol of the data we first compute whether all the elements satisfy !ismissing && != 0 (! = not). Then we pick all rows for those columns, while applying disallowmissing to the data.
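A sketch of that cleanup step, with a toy DataFrame standing in for the article's dataset (the column predicate mirrors the prose above):

```julia
using DataFrames

data = DataFrame(Year = [1928, 1929],
                 A = [116.3, 174.4],
                 B = [missing, missing],   # all-missing column: dropped
                 C = [0.0, 0.0])           # all-zero column: dropped

# Keep a column only if every element is neither missing nor zero.
keep = [all(x -> !ismissing(x) && x != 0, col) for col in eachcol(data)]
data = disallowmissing(data[:, keep])
```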
Row | Year | Argentina | Australia | Belgium | Brazil | Canada | Chile | China | Democratic Republic of the Congo | Denmark | Egypt | Finland | Italy | Japan | Mozambique | Norway | Peru | Portugal | Romania | Spain | Sweden | Turkey | USA | Global |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
1 | 1928 | 116.3 | 378.0 | 1505.0 | 43.61 | 868.6 | 54.57 | 39.78 | 21.81 | 385.2 | 43.8 | 138.1 | 1519.0 | 1897.0 | 7.27 | 158.3 | 25.42 | 36.34 | 163.6 | 763.2 | 233.2 | 29.07 | 15420.0 | 35616.0 |
2 | 1929 | 174.4 | 356.2 | 1606.0 | 47.25 | 963.0 | 72.62 | 76.51 | 29.07 | 396.1 | 87.22 | 138.1 | 1730.0 | 2112.0 | 10.9 | 159.1 | 25.43 | 43.61 | 156.3 | 901.3 | 284.0 | 32.71 | 15000.0 | 36873.0 |
3 | 1930 | 189.0 | 348.9 | 1508.0 | 43.58 | 926.7 | 79.88 | 73.45 | 32.71 | 385.2 | 148.5 | 101.6 | 1723.0 | 1853.0 | 10.9 | 160.2 | 10.9 | 47.29 | 196.3 | 908.4 | 304.6 | 29.07 | 14290.0 | 35561.0 |
4 | 1931 | 265.3 | 196.3 | 1218.0 | 83.59 | 799.5 | 50.88 | 97.93 | 21.81 | 250.8 | 119.9 | 79.95 | 1519.0 | 1788.0 | 10.9 | 109.7 | 14.54 | 47.25 | 98.14 | 806.8 | 257.9 | 50.83 | 10810.0 | 30931.0 |
5 | 1932 | 247.1 | 123.6 | 1039.0 | 72.69 | 363.4 | 54.51 | 79.57 | 7.27 | 203.6 | 119.9 | 76.23 | 1545.0 | 1843.0 | 10.9 | 117.3 | 10.9 | 58.15 | 105.4 | 705.1 | 241.0 | 54.46 | 6656.0 | 24721.0 |
6 | 1933 | 254.5 | 159.9 | 963.1 | 109.0 | 189.0 | 69.05 | 113.2 | 3.63 | 272.6 | 141.7 | 80.05 | 1756.0 | 2366.0 | 10.18 | 110.8 | 14.53 | 79.95 | 109.0 | 694.1 | 200.7 | 58.15 | 5587.0 | 23866.0 |
7 | 1934 | 279.8 | 207.2 | 937.6 | 159.9 | 272.6 | 101.7 | 97.93 | 3.63 | 381.6 | 145.4 | 112.7 | 2013.0 | 2304.0 | 7.27 | 123.6 | 21.82 | 90.94 | 156.3 | 672.3 | 287.1 | 83.59 | 6900.0 | 28542.0 |
8 | 1935 | 356.3 | 276.2 | 1087.0 | 181.6 | 272.6 | 141.6 | 156.1 | 3.63 | 374.3 | 189.0 | 134.3 | 2086.0 | 2904.0 | 7.27 | 130.8 | 29.07 | 105.4 | 189.0 | 668.7 | 367.1 | 65.42 | 6684.0 | 32090.0 |
9 | 1936 | 410.7 | 323.5 | 1163.0 | 239.7 | 388.9 | 123.5 | 428.5 | 3.63 | 392.4 | 167.2 | 163.5 | 1890.0 | 3082.0 | 7.27 | 149.0 | 36.34 | 119.9 | 185.3 | 297.9 | 392.5 | 69.05 | 9950.0 | 38763.0 |
10 | 1937 | 512.3 | 363.4 | 1486.0 | 283.5 | 483.4 | 156.3 | 437.7 | 3.77 | 334.4 | 163.5 | 203.4 | 2155.0 | 2984.0 | 7.27 | 159.9 | 39.96 | 127.1 | 225.3 | 189.0 | 432.5 | 105.4 | 10370.0 | 40829.0 |
11 | 1938 | 614.3 | 178.1 | 1508.0 | 305.5 | 432.5 | 181.7 | 9.18 | 7.5 | 316.2 | 185.3 | 236.2 | 2279.0 | 2729.0 | 11.99 | 163.6 | 50.86 | 130.8 | 221.7 | 294.4 | 490.6 | 130.9 | 9239.0 | 38551.0 |
12 | 1939 | 556.0 | 338.0 | 1261.0 | 345.5 | 450.7 | 167.2 | 223.4 | 17.44 | 345.3 | 185.0 | 279.6 | 2526.0 | 2508.0 | 14.54 | 192.6 | 58.18 | 145.4 | 261.7 | 588.8 | 585.1 | 141.7 | 10790.0 | 35687.0 |
13 | 1940 | 534.4 | 348.8 | 105.4 | 367.1 | 592.4 | 189.0 | 272.4 | 11.45 | 218.1 | 178.1 | 149.0 | 2373.0 | 2101.0 | 14.54 | 167.2 | 61.78 | 134.5 | 196.3 | 770.5 | 345.3 | 130.9 | 11550.0 | 31431.0 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
83 | 2010 | 4178.0 | 3549.0 | 2582.0 | 21290.0 | 6005.0 | 1046.0 | 639600.0 | 189.0 | 672.2 | 20510.0 | 525.7 | 13280.0 | 24320.0 | 340.9 | 754.0 | 3338.0 | 3376.0 | 2778.0 | 11200.0 | 1324.0 | 29980.0 | 31450.0 | 1.2549e6 |
84 | 2011 | 4586.0 | 3496.0 | 2762.0 | 22840.0 | 6020.0 | 1080.0 | 708600.0 | 175.2 | 861.8 | 20100.0 | 557.8 | 12580.0 | 24980.0 | 373.3 | 749.0 | 3300.0 | 2813.0 | 3089.0 | 9523.0 | 1361.0 | 31450.0 | 32210.0 | 1.3498e6 |
85 | 2012 | 4184.0 | 3518.0 | 2643.0 | 25000.0 | 6532.0 | 1128.0 | 714800.0 | 157.9 | 871.1 | 20970.0 | 497.2 | 10070.0 | 25620.0 | 452.7 | 725.0 | 3731.0 | 2550.0 | 3150.0 | 8754.0 | 1479.0 | 31370.0 | 35270.0 | 1.3846e6 |
86 | 2013 | 4581.0 | 3294.0 | 2541.0 | 26650.0 | 5973.0 | 1086.0 | 748300.0 | 174.0 | 867.1 | 20210.0 | 481.8 | 8877.0 | 26810.0 | 505.6 | 731.0 | 4257.0 | 2814.0 | 2695.0 | 7642.0 | 1402.0 | 33910.0 | 36370.0 | 1.4441e6 |
87 | 2014 | 4336.0 | 3138.0 | 2643.0 | 26910.0 | 5912.0 | 1022.0 | 778600.0 | 127.9 | 887.3 | 20760.0 | 468.8 | 8339.0 | 26560.0 | 585.9 | 727.0 | 4590.0 | 3096.0 | 2944.0 | 8897.0 | 1399.0 | 34500.0 | 39440.0 | 1.4999e6 |
88 | 2015 | 4571.0 | 3076.0 | 2348.0 | 25080.0 | 6185.0 | 1033.0 | 722000.0 | 154.6 | 931.5 | 21650.0 | 462.1 | 8196.0 | 25940.0 | 614.2 | 672.0 | 4476.0 | 2921.0 | 3337.0 | 9216.0 | 1537.0 | 34440.0 | 39910.0 | 1.4444e6 |
89 | 2016 | 4029.0 | 2931.0 | 2436.0 | 22420.0 | 6114.0 | 1120.0 | 743000.0 | 98.04 | 1095.0 | 22820.0 | 553.2 | 7680.0 | 25970.0 | 947.8 | 684.0 | 4340.0 | 2297.0 | 3181.0 | 9414.0 | 1554.0 | 37530.0 | 39440.0 | 1.4876e6 |
90 | 2017 | 4362.0 | 3019.0 | 2291.0 | 19080.0 | 6827.0 | 865.9 | 758200.0 | 348.7 | 1194.0 | 21770.0 | 603.7 | 7711.0 | 26430.0 | 910.6 | 766.0 | 4291.0 | 2531.0 | 3310.0 | 9449.0 | 1484.0 | 39470.0 | 40320.0 | 1.5079e6 |
91 | 2018 | 4369.0 | 2942.0 | 2534.0 | 19340.0 | 6915.0 | 782.2 | 786700.0 | 406.1 | 1160.0 | 21000.0 | 601.7 | 7757.0 | 26180.0 | 930.0 | 730.0 | 4320.0 | 2251.0 | 3505.0 | 9667.0 | 1607.0 | 39410.0 | 38970.0 | 1.5692e6 |
92 | 2019 | 4141.0 | 3040.0 | 2819.0 | 19860.0 | 7125.0 | 825.3 | 826900.0 | 451.0 | 1129.0 | 19670.0 | 583.5 | 7912.0 | 25330.0 | 1011.0 | 722.0 | 4546.0 | 2225.0 | 3828.0 | 9064.0 | 1349.0 | 32350.0 | 40900.0 | 1.6175e6 |
93 | 2020 | 3508.0 | 2820.0 | 2634.0 | 22050.0 | 6625.0 | 825.3 | 858200.0 | 451.0 | 1227.0 | 18130.0 | 569.7 | 7059.0 | 24490.0 | 1011.0 | 725.0 | 4546.0 | 2310.0 | 3901.0 | 8192.0 | 1272.0 | 40810.0 | 40690.0 | 1.6375e6 |
94 | 2021 | 4671.0 | 2820.0 | 2634.0 | 23790.0 | 6625.0 | 825.3 | 853000.0 | 451.0 | 1227.0 | 16160.0 | 569.7 | 7059.0 | 23790.0 | 1011.0 | 701.3 | 4546.0 | 2310.0 | 3901.0 | 8609.0 | 1272.0 | 44390.0 | 41200.0 | 1.6729e6 |
Wilcoxon rank-sum (Mann-Whitney U test)
Right-tailed test
Wilcoxon signed-rank test
Kruskal-Wallis test
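The article's worked examples are omitted here; as an illustration, the three tests in this section can be called as follows (made-up data, not the article's emissions dataset):

```julia
using HypothesisTests

x = [1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55]
y = [0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06]
z = [1.10, 0.90, 1.30, 1.70, 0.95, 1.25, 1.40]

MannWhitneyUTest(x, y)                          # Wilcoxon rank-sum
pvalue(MannWhitneyUTest(x, y), tail = :right)   # right-tailed variant
SignedRankTest(x, y)                            # Wilcoxon signed-rank (paired)
KruskalWallisTest(x, y, z)                      # three or more groups
```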
Statistics and Machine Learning made easy in Julia.
- Easy to use tools for statistics and machine learning.
- Extensible and reusable models and algorithms
- Efficient and scalable implementation
- Community driven, and open source
StatsBase
Basic functionalities for statistics
- Descriptive statistics and moments
- Sampling with/without replacement
- Counting and ranking
- Autocorrelation and cross-correlation
- Weighted statistics
StatsModels
Interfaces for statistical models
- Formula and model frames
- Essential functions for statistical models
DataFrames
Essential tools for tabular data
- DataFrames to represent tabular datasets
- Database-style joins and indexing
- Split-apply-combine operations, pivoting
Distributions
Probability distributions
- A large collection of univariate, multivariate distributions
- descriptive stats, pdf/pmf, and mgf
- Efficient sampling
- Maximum likelihood estimation
MultivariateStats
Multivariate statistical analysis
- Linear regression (LSQ and Ridge)
- Dimensionality reduction (PCA,CCA,ICA,...)
- Multidimensional scaling
- Linear discriminant analysis
HypothesisTests
Hypothesis tests
- Parametric tests: t-tests
- Nonparametric tests: binomial tests, sign tests, exact tests, U tests, rank tests, etc
MLBase
Swiss knife for machine learning
- Data preprocessing
- Score-based classification
- Performance evaluation
- Model selection, cross validation
Distances
Various distances between vectors
- A large variety of metrics
- Efficient column-wise and pairwise computation
- Support for weighted distances
KernelDensity
Kernel density estimation
- Kernel density estimation for univariate and bivariate data
- User customization of interpolation points, kernel, and bandwidth
Clustering
Algorithms for data clustering
- Affinity propagation
- Evaluation of clustering performance
GLM
Generalized linear models
- Friendly API for fitting GLM to data
- Work with data frames and formulas
- A variety of link types
- Optimized implementation
NMF
Nonnegative matrix factorization
- A variety of NMF algorithms, including Lee & Seung's, Projected ALS and projected gradient, with optimized implementation.
- NNDSVD initialization
Lasso
Lasso/Elastic Net linear and generalized linear models
- glmnet coordinate descent algorithm
- Polynomial trend filtering
- O(n) fused Lasso
- Gamma Lasso (a concave regularization path glmnet variant)
TimeSeries
Time series analysis
- Tools to represent, manipulate, and apply computation to time series data
Forum: Statistics topic on the Julia Discourse
GitHub page: https://github.com/JuliaStats
HypothesisTests package
This package implements several hypothesis tests in Julia.
- Confidence interval
- p-value
- Parametric tests
- Power divergence test
- Pearson chi-squared test
- Multinomial likelihood ratio test
- Nonparametric tests
- Anderson-Darling test
- Binomial test
- Fisher exact test
- Kolmogorov-Smirnov test
- Kruskal-Wallis rank sum test
- Mann-Whitney U test
- Wald-Wolfowitz independence test
- Wilcoxon signed rank test
- Permutation test
- Time series tests
- Durbin-Watson test
- Box-Pierce and Ljung-Box tests
- Breusch-Godfrey test
- Jarque-Bera test
- Augmented Dickey-Fuller test
- Multivariate tests
- Hotelling's $T^2$ test
- Equality of covariance matrices
- Partial correlation test
Perform a two-sample z-test of the null hypothesis that x and y come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.
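The two-sample z-test described here corresponds to `EqualVarianceZTest` in HypothesisTests.jl. A hedged sketch, assuming the package is installed:

```julia
# Two-sample z-test: equal means and variances under H0,
# different means but equal variances under H1.
using HypothesisTests

x = randn(200)
y = randn(200) .+ 0.3
z = EqualVarianceZTest(x, y)
pvalue(z)
```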
Box-Pierce and Ljung-Box tests. HypothesisTests.BoxPierceTest — Type: BoxPierceTest(y, lag, dof=0). Compute the Box-Pierce Q statistic to test the null hypothesis of independence in a time series y. lag specifies the number of lags used in the construction of Q. When testing the residuals of an estimated model, dof has to be set to the number of estimated parameters. E.g., when testing the ...
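The signature above can be exercised directly. A sketch assuming HypothesisTests.jl is installed; `LjungBoxTest` shares the same `(y, lag, dof)` shape:

```julia
# Box-Pierce and Ljung-Box tests of independence in a time series.
using HypothesisTests

y = randn(200)              # white noise: independence should not be rejected
bp = BoxPierceTest(y, 10)   # Q statistic built from 10 lags
lb = LjungBoxTest(y, 10)    # small-sample corrected variant
pvalue(bp), pvalue(lb)
# For residuals of a model with p estimated parameters, pass dof:
# BoxPierceTest(resid, 10, p)
```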
HypothesisTests.jl is a Julia package that implements a wide range of hypothesis tests.
BayesTesting.jl implements a fresh approach to hypothesis testing in Julia. The Jeffreys-Lindley-Bartlett paradox does not occur. Any prior can be employed, including uninformative and reference priors, so the same prior employed for inference can be used for testing, and objective Bayesian posteriors can be used for testing.
The main difference between the LR test and the Wald test is that the Wald test allows testing any number of linear combinations and hypotheses, incorporating information from the variance-covariance estimates, all with a single fitted model. The likelihood ratio test is more powerful but requires re-fitting the model for each combination.
Anderson-Darling test. Both one-sample and k-sample tests are available. HypothesisTests.OneSampleADTest — Type: OneSampleADTest(x::AbstractVector{<:Real}, d::UnivariateDistribution). Perform a one-sample Anderson-Darling test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d. Implements ...
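The one-sample signature above takes any `UnivariateDistribution` from Distributions.jl. A sketch assuming both packages are installed; the k-sample variant (`KSampleADTest`) compares samples directly without a reference distribution:

```julia
# One-sample and k-sample Anderson-Darling tests.
using HypothesisTests, Distributions

x = randn(100)
ad = OneSampleADTest(x, Normal())   # H0: x is drawn from N(0, 1)
pvalue(ad)

ksad = KSampleADTest(randn(50), randn(50) .+ 1)
pvalue(ksad)
```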
Using pvalue() we can further interrogate the p-values generated by these tests. The values reported in the output above are for the two-sided test, but we can look specifically at values associated with either the left or right tail of the distribution. This makes the outcome of the test more specific.
julia> pvalue(t1)
0.6842692696393744
I am taking a statistical data science course that is taught using R, and I am trying to replicate all the practical things in Julia. As we are doing quite basic stuff, mostly everything has been quite straightforward and similar to R, but I have been having some problems with hypothesis testing. The package I am using is HypothesisTests.jl. I don't exactly understand the implementation of ...
The documented function signature is telling you that :both is the default value for the tail keyword argument, so if you don't supply an alternative value, you will get the two-tailed p-value:
julia> using HypothesisTests
julia> x = randn(100_000); y = randn(100_000);
julia> pvalue(ApproximateTwoSampleKSTest(x, y))
0. ...
Methods This page documents the generic confint and pvalue methods which are supported by most tests. Some particular tests support additional arguments: see the documentation for the relevant methods provided in sections covering these tests.
A tutorial on non-parametric hypothesis tests with examples in Julia.
To finish yesterday's tutorial on hypothesis testing with non-parametric p -values in Julia, I show how to perform the bootstrap stepdown correction to p -values for multiple testing, which many economists find intimidating (including me) but is actually not so difficult. First, I will show the simple case of Holm's (1979) stepdown procedure.
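Holm's stepdown procedure itself needs no package support: sort the p-values, scale the k-th smallest by m − k + 1, and enforce monotonicity. A minimal sketch in plain Julia; `holm` is a hypothetical helper name, not part of HypothesisTests.jl:

```julia
# Holm's (1979) stepdown adjustment for multiple testing (plain Julia,
# no packages; `holm` is a hypothetical helper name).
function holm(pvals::Vector{Float64})
    m = length(pvals)
    order = sortperm(pvals)        # indices of p-values, ascending
    adj = similar(pvals)
    running_max = 0.0
    for (k, i) in enumerate(order)
        # Scale the k-th smallest p-value, keep the running maximum
        # so adjusted values stay monotone, and cap at 1.
        running_max = max(running_max, (m - k + 1) * pvals[i])
        adj[i] = min(running_max, 1.0)
    end
    return adj
end

holm([0.01, 0.04, 0.03, 0.2])   # ≈ [0.04, 0.09, 0.09, 0.2]
```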
HypothesisTests
Hypothesis tests
- Parametric tests: t-tests
- Nonparametric tests: binomial tests, sign tests, exact tests, U tests, rank tests, etc.