***ALERT NOTE***
SAS Note: SN-003570
Product: ENT MINER
Component: DATA.PART
Title:
Partition node does not work correctly with stratification
Text:
If you are using the Data Partition Node and also request
Stratification, the node may not produce training, validation, and test
data sets with the data percentages you request.
The impact of this bug is greater when the target level of interest is
rare.
For example, suppose there are 10 observations out of 100,000 with
stratification variable level A and you have requested a 70% training,
20% validation and 10% test data set split. The Partition node should
place 7 level A observations in the training data set, 2 in the
validation data set, and 1 in the test data set. Because of the bug,
these placements are not made correctly, and there is a chance that one
or more of the partitioned data sets might not have level A at all.
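The arithmetic behind the expected placement can be sketched as follows. This is an illustration only, not the Partition node's actual algorithm: the function name expected_counts and the largest-remainder rounding rule are assumptions introduced here. The percentages are kept as integers so no floating-point rounding is involved.

```python
def expected_counts(n_level, percents=(70, 20, 10)):
    """Split n_level observations of one stratum across partitions.

    percents are integer percentages (training, validation, test).
    Largest-remainder rounding is an assumption of this sketch; it
    guarantees that the counts always sum to n_level.
    """
    # Whole observations each partition gets before rounding up.
    counts = [n_level * p // 100 for p in percents]
    # Fractional part left over for each partition, in hundredths.
    remainders = [n_level * p % 100 for p in percents]
    leftover = n_level - sum(counts)
    # Hand the leftover observations to the largest remainders first.
    order = sorted(range(len(percents)),
                   key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


level_a = expected_counts(10)        # the 10 level A observations
others = expected_counts(99990)      # the remaining observations
```

For the 10 level A observations in the example, this gives 7, 2, and 1 for training, validation, and test, which is the placement a correct stratified split should produce.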
This bug occurs at random. There is no way to tell if a given scenario
will encounter the bug or not.
No notes or warnings are produced. The only way to detect the problem
is to run a frequency analysis of the stratification variable on each
of the partitioned data sets.
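A minimal sketch of such a frequency check is shown below. In SAS itself this would typically be done with a frequency procedure such as PROC FREQ on each partitioned data set; Python is used here only to illustrate the logic. The function names and the sample data are hypothetical, and each partition is assumed to be available as a simple list of stratification-level values.

```python
from collections import Counter


def level_frequencies(partitions):
    """Return {partition name: Counter of stratification levels}."""
    return {name: Counter(levels) for name, levels in partitions.items()}


def missing_levels(partitions):
    """List, per partition, the levels that occur in some partition
    but are absent from this one."""
    freqs = level_frequencies(partitions)
    all_levels = set().union(*(c.keys() for c in freqs.values()))
    return {name: sorted(all_levels - set(c)) for name, c in freqs.items()}


# Made-up data for illustration: level A never reaches validation.
partitions = {
    "train": ["A"] * 7 + ["B"] * 70,
    "validate": ["B"] * 22,
    "test": ["A"] * 3 + ["B"] * 8,
}
freqs = level_frequencies(partitions)
report = missing_levels(partitions)
```

Any non-empty list in report flags a partition that lacks a level entirely. Comparing the counts in freqs with the requested percentages shows smaller deviations as well.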
There is no circumvention.
Keywords:
INCORROUT INCORRSTAT PARTITION STRATIFY STRATIFICATION REG NEURAL TREE
PREDICTIVE MODELING MODEL
Reported Releases:
+------------REPORTED------------+  +------FIXED-----+
System      Version     Level       Version     Level
AIX/R       4.0
DigitUnx    4.0
HP800       4.0
Solaris     4.0
WIN/NT      4.0
Win95       4.0
Fixes Available: None
This archive was generated by hypermail 2b29 : Tue Nov 14 2000 - 15:46:53 EST