Both of these have been discussed before, so it's time to get them
prioritized. I've CC'ed Toby directly so he can follow up:

1. wikimetrics should allow user_name to be the key in report outputs.
Right now only user_id is allowed, which makes the output hard to read
and hard to match against other data. LiAnna, Jaime, and Jessie are
interested in this and have mentioned it a few times.

2. wikimetrics should allow "generated cohorts", as implemented by the
User Metrics API. These are cohorts defined by reports on other cohorts.
For example, if we run report R on cohort C, then the generated cohort
(GC) would be: GC = {user | user in C and R(user) is true}. Dario is
interested in this, and Jaime might be as well.
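To make the two requests concrete, here is a rough sketch in Python. It
is not wikimetrics or User Metrics API code: the function names
(generate_cohort, rekey_by_user_name), the dict shapes, and the sample
data are all made up for illustration, and a report is assumed to be
reducible to a true/false test per user.

```python
from typing import Callable, Dict, Iterable, Set


def generate_cohort(cohort: Iterable[int],
                    report: Callable[[int], bool]) -> Set[int]:
    """GC = {user | user in C and R(user) is true}, over user_ids."""
    return {user_id for user_id in cohort if report(user_id)}


def rekey_by_user_name(results: Dict[int, float],
                       names: Dict[int, str]) -> Dict[str, float]:
    """Re-key report output by user_name instead of user_id.

    Ids with no known name are kept under a 'user_id:<id>' key so that
    no row is silently dropped. This fallback is an assumption.
    """
    return {names.get(uid, f"user_id:{uid}"): value
            for uid, value in results.items()}


# Hypothetical data: edit counts per user_id and an id -> name lookup.
edits = {1: 12, 2: 0, 3: 7}
names = {1: "Alice", 2: "Bob", 3: "Carol"}

# Report R: "made at least 5 edits". The generated cohort keeps user_ids.
active = generate_cohort(edits.keys(), lambda uid: edits[uid] >= 5)

# Output for the generated cohort, keyed by user_name for readability.
by_name = rekey_by_user_name({uid: edits[uid] for uid in active}, names)
```

The sketch keeps user_ids as the internal key and only swaps in
user_names at output time, so renames and name collisions can't change
who is in the cohort.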
On Tue, Nov 26, 2013 at 6:59 PM, LiAnna Davis <[email protected]> wrote:
> I would LOVE it if the output gave user names instead of user IDs. Often
> the data makes me want to investigate the individual stories of
> contributors who added a lot of content/made a lot of edits/etc., but
> there's no way of doing that with user IDs since I can't convert user IDs
> to usernames.
>
> On Tue, Nov 26, 2013 at 2:46 PM, Dario Taraborelli <
> [email protected]> wrote:
>
>> thanks for the clarification Jaime – it sounds like we should consider
>> adding user_names to the output if this is the main cause of the problem
>> instead of building functionality at the input to deal with this. Dan, any
>> thoughts?
>>
>> BTW this notion of rerunning cohort analysis for members of a previous
>> cohort who meet specific criteria is a use case that Product/Editor
>> Engagement is also interested in. We used to call these “generated cohorts”
>> in the old design plans for UserMetrics and I’d love it if we revisited
>> this feature request and its relative priority.
>>
>> D
>>
>> On Nov 26, 2013, at 2:35 PM, Jaime Anstee <[email protected]> wrote:
>>
>> Missed the question back to me, sorry. Mixed cohorts might occur because
>> the output is in user IDs while collection is by username. Say someone
>> runs repeating events and has a CSV output of data for the new users who
>> were retained at a certain activity level from Point A to Point B. New
>> cohort members then opt in at Point B, and for examination at a later
>> Point C the organizer wants to include only the users who survived from
>> Point A plus the members who are new at Point B. Without usernames in the
>> output to create the active Point B cohort separately, the Point C cohort
>> would be a mix of qualified user IDs and new usernames. There are several
>> ways of dealing with this; it was just the first scenario I could think of
>> that could cause it. It seems we still need to revisit the possibility of
>> getting usernames as output, also for matching to other data points,
>> since most users and most program leaders do not know user IDs.
>> - Jaime
>>
>> --
>>
>> Jaime Anstee, Ph.D
>> Program Evaluation Specialist
>> Wikimedia Foundation
>> +1.415.839.6885 ext 6869
>> www.wikimediafoundation.org
>>
>> Imagine a world in which every single human being can freely share in the
>> sum of all knowledge. Help us make it a reality!
>> *https://donate.wikimedia.org <https://donate.wikimedia.org/>*
>>
>>
>>
>> On Fri, Nov 22, 2013 at 4:04 PM, Dario Taraborelli <
>> [email protected]> wrote:
>>
>>> that works for me, thanks!
>>>
>>> Jaime – can you give us more details on the use case for mixed cohorts
>>> that you had in mind?
>>>
>>> On Nov 22, 2013, at 3:28 PM, Dan Andreescu <[email protected]>
>>> wrote:
>>>
>>>
>>>>> So, for now, until I figure out how to fix this, it will always prefer
>>>>> user_names before user_ids.
>>>>
>>>>
>>>> I think this is an argument for making users specify up front whether
>>>> the cohort contains names or ids, and not allowing mixtures. Assuming it
>>>> might be a mixture and looking for names first is almost certain to
>>>> produce inaccurate results at some point. We have ids precisely to avoid
>>>> collisions between names, to allow for renaming users, and to handle
>>>> other such cases.
>>>>
>>>
>>> Yep, I just learned this the hard way and made a fool of myself in front
>>> of a bunch of people I admire. So, I'd be glad if I'm the only one that
>>> this happens to. If nobody objects, I'm going to allow the user to select
>>> whether their cohort contains user_ids OR user_names, and strictly prohibit
>>> mixtures.
>>>
>>> _______________________________________________
>>> Wikimetrics mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/wikimetrics
>>>
>
>
> --
> LiAnna Davis
> Wikipedia Education Program Communications Manager
> Wikimedia Foundation
> http://education.wikimedia.org
> (415) 839-6885 x6649
> [email protected]